MEpedia and Internet Archive

Discussion in 'MEpedia' started by WillowJ, Mar 3, 2019.

  1. WillowJ

    WillowJ Senior Member (Voting Rights)

    Messages:
    676
    Most pages have not been archived in over a year, and some have not been archived at all.

    I was able to add/update a few to archive.org, but it might be good to set up a method. Possibly some tech person even knows of a way to automate this.
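
    A minimal sketch of what automating this could look like, assuming the Wayback Machine's "Save Page Now" address (https://web.archive.org/save/ followed by the page URL) accepts plain requests. The MEpedia base URL, the page titles, and the function names here are assumptions for illustration, not a tested setup.

```python
import time
import urllib.request
from urllib.parse import quote

SAVE_ENDPOINT = "https://web.archive.org/save/"

# Hypothetical list of page titles; replace with the pages you care about.
PAGES = ["Main Page", "PACE trial"]


def wiki_url(title, base="https://me-pedia.org/wiki/"):
    """Build a page URL from a title (the base URL is an assumption)."""
    return base + quote(title.replace(" ", "_"))


def save_url(page_url):
    """Return the Save Page Now address that requests a fresh capture."""
    return SAVE_ENDPOINT + page_url


def archive_all(titles, delay=10):
    """Request a capture for each title, pausing between requests."""
    for title in titles:
        with urllib.request.urlopen(save_url(wiki_url(title))) as response:
            print(title, response.status)
        time.sleep(delay)  # be polite; the archive may rate-limit requests
```

    Calling archive_all(PAGES) would request one capture per page. The pause between requests is a guess at polite behaviour, not a documented limit.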
     
  2. daftasabrush

    daftasabrush Senior Member (Voting Rights)

    Messages:
    197
    Sounds good. MEpedia also keeps history of each page but a separate archive is important too.

    It's not clear to me how the web archive chooses which pages to archive. I know it won't just archive the whole of any site.

    Perhaps the Contents page is a key one to archive and might encourage the archive to visit and archive many of the links on there?

    I think there are issues with a low crawl rate on MEpedia too; I've noticed recent pages and not-so-recent changes not appearing in Google searches.
     
    Patient4Life, rvallee and WillowJ like this.
  3. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    I will do some pages. Never worked with it. What do I do?

    EDIT: I just figured it out. Saved main page and a few other pages.
     
    Last edited: Mar 3, 2019
  4. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    Main Page, PACE trial, ICC, CCC, and IOM report, Fibromyalgia, ME, CFS, SEID, CF, and all the Primers have been saved to the Wayback Machine, as well as a few other pages such as Neuroinflammation and Brain scans.

    EDIT: If this is not the same as the Archive.org page (although Wayback is found at Archive.org), someone else will have to work on that, as I cannot figure out how to use it.
     
  5. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    I would not worry so much about archiving MEpedia on the Internet Archive.
    I assume MEAction has no plans to take it down.
    On the other hand, I think the references in David Tuller's articles desperately need archiving because the authors want to erase any history that they don't want immortalized.

    The IA seems to have some sort of algorithm. I was reading that manually adding a page does not add that page or website to its roster, so I'm assuming it has something to do with worldwide traffic levels or how often it's linked elsewhere or something like that.
     
    inox, Esther12 and Patient4Life like this.
  6. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    I just tried the Google Search Console, and as far as I can see it will need to be done by someone like Jen or another MEpedia administrator. I guess doing the same pages I did on the Wayback Machine (see above) would be a good idea.

    FYI @JenB @JaimeS

    I also did the Trial By Error, Open Medicine Foundation, GWI, Simon W, Esther C, Michael S, GET, CBT, and a few other pages on the Wayback Machine.
     
  7. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    Wow.
    I did the references on the PACE Intimidation article a week ago, I think, but there's no way I would be able to do that on a regular basis, or even keep up with the updates since.
     
    Patient4Life likes this.
  8. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    I can't do anything on a regular basis either. I just can't be responsible for something like this. Oh, and I only did the pages, not their references one by one.

    Just to note here: I also did PEM and Pediatric ME/CFS and List of symptoms of ME CFS. Also Lady Gaga, all the Fibro pages, not just the main page, and Dry eyes syndrome, Lyme disease, and Lupus.

    So, if anyone should ever want to keep updating Wayback, they can look at my list and copy down.
     
  9. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    Those citations are very important. Thanks for doing them.
     
    rvallee and Alvin like this.
  10. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    I can't be responsible either. I can do a bit here and there, but doing it on a regular basis is not going to happen.
    MEpedia is constantly changing, so updating it on the IA regularly would be a Sisyphean task.

    I think the references are actually the most important; if they get deleted then there is no citation for the MEpedia article.
    In some cases it's generic information so no big deal, but in some you need citations, especially when it's talking about people or actions.
     
    Patient4Life likes this.
  11. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    This link won't update on Wayback and I don't know why. I run into little things like this.
    https://www.bbc.com/news/uk-12195884
     
  12. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    Patient4Life likes this.
  13. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
  14. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
  15. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    It's there too
    https://web.archive.org/web/2013101...cet/article/PIIS0140-6736(06)68662-5/fulltext

    I use Firefox and the Get Archive addon. It allows you to right-click on any page and get the IA version. If it's not there then you can click add to IA.

    For the Lancet one I had to remove the #article_upsell at the end of the URL.
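
    A small sketch of that fragment-stripping step, assuming Python's standard urllib.parse module. The example.org address is a placeholder rather than the real Lancet link, and the availability address is the Internet Archive's public lookup as I understand it.

```python
from urllib.parse import urldefrag, urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def strip_fragment(url):
    """Drop anything after '#', such as '#article_upsell'."""
    return urldefrag(url).url


def availability_query(url):
    """Build a lookup address asking whether a capture of url exists."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": strip_fragment(url)})


# Placeholder address standing in for a journal page with a tracking fragment.
print(strip_fragment("https://example.org/article/fulltext#article_upsell"))
```

    Stripping the fragment first means the archive is asked about the page itself rather than a variant that it may treat as a different address.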
     
    Patient4Life likes this.
  16. Patient4Life

    Patient4Life Senior Member (Voting Rights)

    Messages:
    213
    Thank you. I guess I am not really familiar with all of this. I will practice.
     
    Alvin likes this.
  17. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    No worries, I have fought with IA for a while so I have some experience with weird happenings.
     
    Last edited: Mar 3, 2019
  18. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    13,668
    Location:
    Canada
    I see that robots.txt is not configured. It's a special file that tells search engines which pages they may crawl, and it can point them to a sitemap that lists updates and which pages are most important. Without that, search engines do a basic crawl and set their own rules based on traffic (more visited sites get more interest, but this only works at much higher rates, so for me-pedia it would be minimal).

    This is something that needs to be set up in the wiki software. It's all automatic once it's configured properly. Internet Archive probably relies on it, at least partially.

    (I also noticed it's not configured on s4me.info, should be looked into, the forum software should have the option)
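
    To illustrate what a configured robots.txt gives a crawler, here is a sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The sample file contents and the sitemap address are made up for the example and are not MEpedia's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
Allow: /wiki/
Sitemap: https://me-pedia.org/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text and return the parser holding its rules."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Article pages are allowed; raw index.php URLs are not.
print(rules.can_fetch("*", "https://me-pedia.org/wiki/Main_Page"))
print(rules.can_fetch("*", "https://me-pedia.org/index.php?title=Main_Page"))

# The Sitemap line is where crawlers learn about the list of pages.
print(rules.site_maps())
```

    The allow and disallow rules only say where a crawler may go. The page list and update dates live in the sitemap that the file points to.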
     
    JaimeS and Patient4Life like this.
  19. JaimeS

    JaimeS Senior Member (Voting Rights)

    Messages:
    1,248
    Location:
    Stanford, CA
    Thanks guys -- I passed this on to our MEpedia Volunteers Slack channel.
     
    Alvin likes this.
  20. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    It might also be worth asking them if there is a way that references can be automatically submitted to the Internet Archive.
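
    One way this could work, as a rough sketch: pull the external links out of a page's HTML and turn each one into a Save Page Now address. The class and function names, the me-pedia.org host, and the sample HTML are assumptions for illustration. Only the BBC link comes from this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

SAVE_ENDPOINT = "https://web.archive.org/save/"


class ExternalLinkParser(HTMLParser):
    """Collect absolute links that point away from the wiki itself."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if not href.startswith(("http://", "https://")):
            return  # relative links stay inside the wiki
        if urlparse(href).netloc == self.own_host:
            return  # absolute link back to the wiki
        if href not in self.links:
            self.links.append(href)


def reference_save_urls(page_html, own_host="me-pedia.org"):
    """Return one Save Page Now address per external link in the page."""
    parser = ExternalLinkParser(own_host)
    parser.feed(page_html)
    return [SAVE_ENDPOINT + link for link in parser.links]


# Made-up page fragment with one internal and one external link.
SAMPLE_HTML = (
    '<p><a href="/wiki/PACE_trial">PACE trial</a> '
    '<a href="https://www.bbc.com/news/uk-12195884">BBC</a></p>'
)
print(reference_save_urls(SAMPLE_HTML))
```

    Each returned address could then be requested with a pause between calls, so that a page with many references does not flood the archive.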
     
    JaimeS likes this.
