The latest version of the Stack Overflow Creative Commons Data Dump is now available. This reflects all public data in Stack Overflow up to September 2009.
Download the Stack Overflow Creative Commons Data Dump via BitTorrent
Please note that the Stack Overflow data dumps are now hosted at LegalTorrents! You can subscribe via RSS and be notified every time a new dump is available.
Have fun remixing and reusing; all we ask is for proper attribution.
September 2nd, 2009 at 1:44 pm
Cool. Thanks again for being such a good ‘netizen’ with these dumps.
I’ve got bandwidth to blow, and have set up an archive page for HTTP downloads of both 7za and tar.bz2 dumps. Enjoy.
September 2nd, 2009 at 1:44 pm
The link would help, eh?
http://media10.simplex.tv/content/xtendx/stu/stackoverflow/
September 2nd, 2009 at 4:17 pm
I hope those other dodgy sites don’t just merge this with their old data dump.
I wonder if my question is still front paged?
captcha: hanauer leaving
September 2nd, 2009 at 10:23 pm
Will we ever see a CC dump of SF and SU?
September 2nd, 2009 at 10:41 pm
The actual 7zipped file is named so-export-2009-08.7z; shouldn’t that be so-export-2009-09.7z?
@Farseeker: Not knowing anything factual, I will guarantee that “will we ever..” is going to be answered with “yes”. :)
September 2nd, 2009 at 11:34 pm
@Bremen: Yeah, you found a change that has been discussed on meta.stackoverflow.com. Apparently this was due to an error, and the dumps will soon be automated. Also, the powers that be are going to name the file (from now on) for the time frame that the snapshot stops on. E.g.: this September release includes data up to midnight 31 August, so it will have an 8 in its file name.
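For anyone scripting downloads, here is a rough sketch of that naming rule in Python. It assumes the so-export-YYYY-MM.7z pattern seen in this release keeps being used, and that each dump is published in the month after the snapshot ends; dump_filename is just a made-up helper name, not anything official.

```python
from datetime import date, timedelta


def dump_filename(release_day: date) -> str:
    """Guess the dump file name for a dump published on release_day.

    Assumption: the file is named for the month the snapshot stops on,
    i.e. the month before the release, using the so-export-YYYY-MM.7z
    pattern seen in the September 2009 release.
    """
    # Stepping back one day from the first of the release month lands
    # on the last day of the previous month, where the snapshot ends.
    snapshot_end = release_day.replace(day=1) - timedelta(days=1)
    return f"so-export-{snapshot_end.year:04d}-{snapshot_end.month:02d}.7z"


print(dump_filename(date(2009, 9, 2)))
```

For the September 2nd release this gives the 2009-08 name that Bremen noticed. A release in January would be named for December of the previous year.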
September 6th, 2009 at 1:43 pm
I have imported the datadump into a PostgreSQL database, publicly (anonymously) accessible, at http://www.rdbhost.com/rdbadmin/main.html?r0000000767
Log in with rolename ‘r0000000767’ and no authcode.
The data is also accessible through a Python DB API 2 module, rdbhdb, downloadable from:
http://www.rdbhost.com/downloads.html
David Keeney
dkeeney@rdbhost.com
October 8th, 2009 at 5:05 am
Hi Jeff,
Do you plan to prepare and upload an October dump?