wikipedia semi-tech question

I would give my eyeteeth and all my pocket money right now to be able to search across the text of all revisions of a single page — so search bounded within one page title but across all diffs. Does anyone know of such a thing?


7 Responses to wikipedia semi-tech question

  1. reddragdiva says:

    I bet the answer is “download a complete history dump and get clever with SQL.”

  2. brassratgirl says:

    sure… I was just hoping someone with, say, toolserver access had already done it 🙂 I am not that clever with SQL (yet!)

  3. kenllama says:

    that’s a good idea, phoebe, why don’t you do that…!

  4. kenllama says:

    hmmm… all implications that you should do this aside, this might be within my technical capacity.

    first big issue i see, though: to build this as a third-party tool means a whole lotta http calls to build the database even for one page. this would make a lot more sense built from the inside — someone with query access to the WP servers. do you have that kind of access? i sure don’t…
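
A rough sketch of the third-party approach described above, for anyone who wants to try it: page through one title's revision history over HTTP and keep the revisions whose text contains a search string. It assumes the MediaWiki Action API at en.wikipedia.org (`action=query` with `prop=revisions`); the function names `fetch_revisions` and `search_revisions` and the User-Agent string are made up for illustration. It does make one request per batch of revisions, so a page with a long history still means a whole lot of calls.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's Action API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_revisions(title, batch_size=50):
    """Yield (revid, timestamp, text) for each revision of one page, newest first."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|content",
        "rvslots": "main",
        "rvlimit": str(batch_size),
    }
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        # Hypothetical User-Agent; replace with something that identifies you.
        request = urllib.request.Request(
            url, headers={"User-Agent": "revision-search-sketch/0.1"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for page in data.get("query", {}).get("pages", []):
            for rev in page.get("revisions", []):
                # Hidden or deleted revisions may carry no content; treat as empty.
                text = rev.get("slots", {}).get("main", {}).get("content", "")
                yield rev["revid"], rev["timestamp"], text
        # The API returns a "continue" object while more revisions remain.
        if "continue" not in data:
            break
        params.update(data["continue"])


def search_revisions(revisions, needle):
    """Return (revid, timestamp) for revisions whose text contains needle (case-insensitive)."""
    needle = needle.lower()
    return [(revid, ts) for revid, ts, text in revisions if needle in text.lower()]
```

Usage would look something like `search_revisions(fetch_revisions("Some page title"), "some phrase")`. Because the search function only needs an iterable of `(revid, timestamp, text)` tuples, the same function would work on revisions parsed from a history dump instead of fetched over HTTP.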

  5. brassratgirl says:

    let me introduce you to the great and glorious toolserver 🙂
    http://tools.wikimedia.de/

    but no, I don’t have an account on it or anywhere else. Reddragdiva’s suggestion to download a dump and code against it is probably right. If you want to start writing tools/hacks like this, the wikitech list is probably the place to begin…

  6. brassratgirl says:

    eg:
    https://wiki.toolserver.org/view/Query_service

    though not practical for the long term, you could test it out.

  7. kitty_scarboro says:

    ohhh..eyeteeth! i am so there.

    if the tools are easy to write, maybe sasha and i could hack something out.
