Very Good Passive Vocab Strategy

Recently found this:


My method is:

  1. Download the subtitles using this greasemonkey script
  2. Drop the subtitle file into CTA
  3. Copy the unknown words into Purple Culture’s vocab list generator
  4. Copy the resulting list into Google Docs
  5. Download as CSV
  6. Import into Anki
  7. Add audio with AwesomeTTS

This may seem like a lot of steps, but it only takes a couple of minutes. I wouldn’t add all the unknown words, just enough to get to 90% or more comprehension. There are a few more steps in there that are specific to my process, like adding tags, but that’s the basic process.
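To make the "just enough to get to 90%" idea concrete, here is a rough sketch in Python. It is not how CTA does it internally. It assumes the subtitle text has already been split into words (CTA does that segmentation for you), and the function name `words_to_reach` is made up for illustration. It picks unknown words from most frequent to least frequent until the share of known word occurrences hits the target.

```python
from collections import Counter

def words_to_reach(tokens, known, target=0.90):
    """Return the most frequent unknown words needed to reach `target` coverage.

    tokens: list of already-segmented words from the subtitles (assumption).
    known:  set of words you already know.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return []
    # Occurrences already covered by known words.
    covered = sum(c for w, c in counts.items() if w in known)
    picked = []
    # most_common() walks words from highest to lowest count.
    for word, c in counts.most_common():
        if covered / total >= target:
            break
        if word in known:
            continue
        picked.append(word)
        covered += c
    return picked
```

Calling `words_to_reach(tokens, known)` gives the shortlist you would carry over to the vocab list generator, instead of every unknown word in the file.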


Basically all you need is the following:
Step #1: there are Greasemonkey scripts you can download for Disney Plus and Netflix (and probably YouTube, etc.).

  1. Get access to the legacy Skritter website and download all your words as an Excel file
  2. Import your Excel words into Chinese Text Analyzer
  3. Download the subtitles using this greasemonkey script
  4. Save as a CSV
  5. Drop the subtitle file into CTA

Basically, Chinese Text Analyzer looks at the words you know and the words in the subtitle text, and the words you don’t know “fall out”.
It then ranks those words by how often they are spoken in the movie, from most frequent to least frequent. It also gives you feedback, e.g. it will tell you that you know 92% of the words in this script.
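For anyone curious what that boils down to, here is a minimal Python sketch of the same idea. This is my own approximation, not CTA’s actual code. It assumes the subtitles are already segmented into words, and `analyze` is a hypothetical name.

```python
from collections import Counter

def analyze(tokens, known):
    """Return (percent of word occurrences known, unknown words ranked by count).

    tokens: list of already-segmented words from the subtitle file (assumption).
    known:  set of words you already know (e.g. exported from Skritter).
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    known_total = sum(c for w, c in counts.items() if w in known)
    # Unknown words "fall out", ordered from most spoken to least spoken.
    unknown = [(w, c) for w, c in counts.most_common() if w not in known]
    percent_known = 100 * known_total / total if total else 0.0
    return percent_known, unknown
```

The percentage here is counted per occurrence, so a word said ten times counts ten times. I am not sure whether CTA reports it per occurrence or per unique word, so treat that as an assumption.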

I haven’t found a more powerful tool when it comes to being able to just “relax” and watch a movie in Chinese, knowing that it will take me only a couple of seconds to get all of these statistics, plus the words/chengyu I don’t know, from the software. So the passive vocab strategy is basically just getting exposed to a bunch of words by watching anything, knowing you can make it as active as you want on your next rewatch by processing however many new words you think necessary.

I’m currently using the Wenlin dictionary to determine “relative frequency”; however, I was hoping someone had either a website or an Excel file where you can easily look up the frequency of a word. I know of some that are out there, like the one below.

But these files are confusing. For example, many words are listed as ranked “11”, like 50 times. I’m sure someone would understand that. But in lieu of this, I was hoping to find a straightforward Excel file with maybe the top 50,000 words, so that 1) I can VLOOKUP against the file without breaking my spreadsheet and 2) I can understand how the words are ranked.

For anyone interested:

If you click into each file, you can download a CSV file. Then convert them to Excel and copy and paste them together to create one list. This is good for 1) “vetting” new words to see how frequent they are before bothering to learn them, and 2) finding the gaps in your vocab by doing a VLOOKUP against the 56k list; any gaps should fall out.
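If the spreadsheet gets too heavy, the same two lookups can be sketched in Python. This assumes the combined list has a header row with columns named `word` and `rank`; those names are my guess, not the actual layout of the files, so adjust them to whatever the CSVs really use. The function names are made up too.

```python
import csv

def load_ranks(lines, word_col="word", rank_col="rank"):
    """Build {word: rank} from CSV lines (an open file object works too).

    Column names are assumptions; the first rank seen for a word wins.
    """
    ranks = {}
    for row in csv.DictReader(lines):
        ranks.setdefault(row[word_col], int(row[rank_col]))
    return ranks

def vet(words, ranks):
    """Like a VLOOKUP: rank for each new word, None if it is not in the list."""
    return {w: ranks.get(w) for w in words}

def gaps(ranks, known, top_n=None):
    """Words in the frequency list you don't know yet, most frequent first."""
    missing = [(r, w) for w, r in ranks.items()
               if w not in known and (top_n is None or r <= top_n)]
    return [w for r, w in sorted(missing)]
```

With a real file you would do something like `with open("combined.csv", newline="", encoding="utf-8-sig") as f: ranks = load_ranks(f)`, where `combined.csv` is just a placeholder name. Then `gaps(ranks, known, top_n=5000)` lists what you are missing from the top 5,000.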

I don’t know what it is based on, but “Zero to Hero” is a big name in the language learning community.