Version 2 consists of the version 1 sources, plus some big lists from RAID forums and the famous Antipublic collection.
The Antipublic collection is a huge pile of dumps in 4 big parts containing ~74k files.
I go through each file to check that it is properly formatted and doesn't contain hashed passwords.
If a file is mixed (contains both hashed and plain-text passwords), I leave it in.
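How the check is actually done is not described here, so the following is only a rough sketch of the keep/drop rule. It assumes that a "hashed" entry is a bare hex digest of a common length (32, 40 or 64 characters); the names HEX_DIGEST, classify_lines and keep_file are made up for illustration, and real dumps use many other hash formats that this would miss.

```python
import re

# Assumption: a hashed entry is a bare hex digest of a common length
# (32 = MD5, 40 = SHA-1, 64 = SHA-256). Salted or prefixed formats are not covered.
HEX_DIGEST = re.compile(r"[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64}")


def classify_lines(lines):
    """Return 'plain', 'hashed' or 'mixed' for an iterable of text lines."""
    hashed = plain = 0
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # blank lines do not count either way
        if HEX_DIGEST.fullmatch(entry):
            hashed += 1
        else:
            plain += 1
    if hashed and plain:
        return "mixed"
    return "hashed" if hashed else "plain"


def keep_file(lines):
    """Keep plain and mixed files; drop files that hold only hashes."""
    return classify_lines(lines) != "hashed"
```

A file with only digest-like lines is dropped, while a file that has at least one non-digest line alongside them counts as mixed and stays in.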
Then all lists are processed with VB.NET scripts that remove spaces, check for invalid ASCII symbols, etc.
Finally, I run them through the Linux sort -u command, which sorts them alphabetically and removes duplicates.
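The cleanup and de-duplication steps can be approximated in a few lines of Python. This is a sketch and not the VB.NET scripts themselves: it assumes that "remove spaces" means trimming leading and trailing whitespace, and that an invalid symbol is anything outside printable ASCII (0x20 to 0x7E). The function names are hypothetical. For pure-ASCII input, sorted(set(...)) gives the same order as sort -u run in the C locale; other locales may order lines differently.

```python
def clean_lines(lines):
    """Trim whitespace and yield only non-empty, printable-ASCII entries."""
    for line in lines:
        entry = line.strip()
        # Assumption: anything outside 0x20..0x7E counts as an invalid symbol.
        if entry and all(0x20 <= ord(ch) <= 0x7E for ch in entry):
            yield entry


def sort_unique(lines):
    """Rough in-memory equivalent of `LC_ALL=C sort -u` for ASCII input."""
    return sorted(set(lines))


sample = [" hunter2 \n", "abc123\n", "hunter2\n", "caf\u00e9\n", "\n"]
print(sort_unique(clean_lines(sample)))  # prints ['abc123', 'hunter2']
```

This version holds everything in memory, so it only suits small lists; for files of this size the external sort -u command is the practical choice, since it can spill to temporary files on disk.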