
Title: HighWire Press
Publisher: Stanford University Libraries
URL: http://highwire.stanford.edu/
Cost: Partially free
Tested: October 20-November 1, 2009
HighWire Press remains the best host of the digital collection of scholarly publishers who want to offer access to the digital versions of their journals, but don't have the skills, resources and/or interest to do it on their own.
HighWire Press has not only the largest full-text searchable hosted collection, with more than six million scholarly articles, but also the largest free subset, with nearly two million open-access items.
This well-designed service also has feature-rich software, and detailed, accurate, informative statistics about the features of the hosted collection — as well as some limitations.
It was about 15 years ago when I first used the services of HighWire Press and it was love at first sight, not only because it had about 100,000 open-access articles, but also because of the sophisticated features of the software in linking among cited and citing journal articles hosted by HighWire Press. I wrote about HighWire Press more than four years ago in this column and it has been a recurring subject of some of my other articles, such as the one in Information Today in 2003. It was also on the list of Peter's Picks and Pans before (as a pick, of course).
There is a good reason that I have remained an enthusiastic user of the service. The journal base kept growing and so did the set of software features, such as the very valuable clues (implemented in the archives of several journals hosted by HighWire Press) about how many times an article has been cited by journals hosted by HighWire Press, or covered by Thomson's Web of Science or Elsevier's Scopus, or by journals participating in the CrossRef project. I intentionally used OR instead of AND because for most of the journals this valuable score is available only from one or two of the databases mentioned above.
There are some other similar services that I refer to as digital facilitators (a term disliked by HighWire Press) because they facilitate for publishers the very demanding process of providing access to their journals for subscribers and selling articles to non-subscribers. Having expensive and powerful software is not enough, as was so clearly demonstrated by the inferior implementation of the digital collection of Kluwer Academic, which made very poor use of the Verity software. Wiley has not been able to get the most out of that software in implementing its InterScience service either.
Only MetaPress is in the same league among the digital facilitators as HighWire Press. Actually, it hosts more journals than HighWire Press, but the total number of articles in MetaPress is below five million records (which is very fine). In my searches I come across open-access articles less often than in HighWire Press, and its software features are not as sophisticated as those of HighWire Press.
IngentaConnect has been one of the competitors of HighWire Press (especially after it acquired the Catchword software, but did not make use of its more sophisticated features). Its most significant shortcoming is that despite (non-public) promises, it still does not offer full-text searching of the hosted articles, which is simply beyond me. The same is true for the very modest Scitation service of the American Institute of Physics, whose beta version I reviewed last month in this column. Atypon is the latest competitor, but it has a much smaller digital collection, far fewer and smaller publishers, and far fewer open-access articles than the first three digital facilitators mentioned above. Allen Press is by far the smallest digital facilitator, hosting about 30 journals with elementary software options.
HighWire Press tells much about its content right on the homepage. Its numbers seem to be right and my test searches closely matched them. The slight differences may be caused by the fact that I re-did my test on November 1 and the homepage introduction may have been from a few days earlier. It facilitates the digitization of more than 1,277 journals of 140 publishers, and it does show the list of the 1,277 journals and nearly 100 books (such as the Red Book), if you care to count the entries as I did.
HighWire Press made its name by pioneering and facilitating the digitization of biomedical journals. No wonder that there is still an entry in the journal list with the title of HighWire Library of the Sciences and Medicine (but it just takes you to the basic search page of HighWire Press). The biomed journals, along with physical sciences journals, still represent the majority of its content, but journals in the Social Sciences have also become significant, as have, to a lesser extent, journals in Arts & Humanities. Going to the list of journals hosted by HighWire Press and expanding the list of subject areas within the major disciplinary areas shows the impressive number and variety of journals across all disciplines and sub-disciplines, as displayed in the field of criminology here. Two publishers contribute the most to the breadth of the scope of coverage: SAGE with 550 journals and Oxford University Press with 254 journals. Oxford is also one of the publishers that provide the most open-access articles, while SAGE's contribution is minimal in this regard.
HighWire Press reports that it has 6,113,416 full-text articles and that 1,951,014 of these are free. Both figures are lower than what my test search produced, in which I extended the publication time period to December 2009. The PR data may reflect the status as of October 2009, and they clearly refer to articles. My total is higher (6,297,623) as I also included the number of book chapters.
What is relevant is that for so many scholarly journal articles, the full text is searchable in one fell swoop and the results are reproducible.
HighWire Press seems to have accurate bibliographic data for nearly 6.3 million scholarly papers and book chapters — along with close to 3.4 million abstracts for journal and conference papers freely available from widely and highly respected scholarly journals (and an increasing number of books) hosted by HighWire Press.
This is in sharp contrast with the millions of records of articles and books that Google Scholar attributed to ghost authors, replacing the names of real authors with words that the brain-damaged parsers of Google Scholar extracted from article section titles, author affiliations and search menu options and fancied for author names, such as the ones I illustrated in my short piece on the online site of Library Journal Academic Newswire a good month ago. Google Scholar developers quickly remove some of the most ridiculous records that I use for illustrations in my reviews and presentations, but this is usually just a PR Band-Aid, not a sincere effort to fix the metadata mega mess in Google Scholar. Neither does it help all those authors who were robbed of their authorship and citation counts in the past five years when their productivity and impact factor were calculated in decisions of tenure, promotion and grant applications using Google Scholar. This remains a large-scale problem that I will discuss in a forthcoming column.
HighWire Press's publisher clients are among the best-known names in the scholarly world, such as Oxford University Press, SAGE, American Academy of Neurology, American Academy of Pediatrics, American Association for Cancer Research, American Cancer Society, American College of Cardiology, American Diabetes Association, American Heart Association, American Medical Association, American Physiological Society, American Society for Microbiology, the British Medical Journal Publishing Group, Geological Society of America, Geological Society of London, Lippincott, Williams and Wilkins, Massachusetts Medical Society and Society for General Microbiology.
One of the newest partners, the Royal Society, added extra clout to HighWire Press and, on the side, an extra Schadenfreude bonus, as it was formerly a customer of MetaPress. Then again, there also are partner losses, such as that of Annual Reviews, which moved to the Atypon platform, becoming the most highly ranked publisher in the Atypon stable.
Even a single journal can be a big attraction for many users, because of the clout of that publication, such as Science magazine, or the Proceedings of the National Academy of Sciences in HighWire Press. Many of the hosted journals also are widely known beyond their disciplinary areas, such as the New England Journal of Medicine, Blood, Chest, Circulation, the FASEB Journal, Heart, Gut or BMJ.
It also is a telling sign that nearly 700 of the journals hosted by HighWire Press are among the journals monitored by Thomson Reuters for the Journal Citation Reports. It is equally telling for the clout of the journal stable of HighWire Press that 38 of the 100 most influential biomed and natural history journals (according to members of the BioMedical & Life Sciences Division of the Special Libraries Association) are hosted by HighWire Press.
Beyond being a sophisticated digital host, HighWire Press also stands out from the group of digital facilitators because of the volume of open-access full-text articles. This varies from publisher to publisher and often from journal to journal. The exceptions are those publishers that make available the entire digitized segment of their journals (or books as is the case with the Education Book of the American Association for Cancer Research) hosted by HighWire Press. There are 46 such sites and they are easy to find by browsing through the list of freebies where the "free site" label identifies the journals/books.
The most common practice is to make all the articles open access one year after the publication date. In some cases the delay may be longer, such as 18 months with the Agronomy Journal or Crop Science, 24 months with Biostatistics or Briefing in Biostatistics, 36 months with Health Affairs or Laboratory Animals, or 10 years, as it is for The Journal of Bone and Joint Surgery. To its credit, JBJS pre-2000 articles are free back to its first issue in 1889, which means that 17,729 articles (75%) are free of charge out of the total of 23,675 papers in the American edition of this journal. If the search is extended to both the British and American editions, the number of open-access papers increases to 26,593 (76%) out of the total 35,000.
Then again, there are journals that have shorter delays, such as the 29 journals with a six-month delay and four with a delay of merely three months, such as Diabetes and Diabetes Care. The pattern is not predictable, as the same publisher may have very different policies for its different journals. For example, the American Diabetes Association has, beyond the above two journals with a three-month delay for open access, two immediately open-access journals, Clinical Diabetes and DOC News, and one, Diabetes Spectrum, with a six-month delay.
There also are journals that offer open-access to only certain sections even after the moratorium and require registration, such as the Archives of Facial Plastic Surgery, Archives of General Psychiatry, Archives of Internal Medicine and several others. (In the examples above, the word "Archives" is part of the journal names, not a reference to the concept of digital archives).
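The open-access shares quoted above for The Journal of Bone and Joint Surgery are simple ratios, and they are easy to re-check. Below is a minimal sketch in Python; the function name open_access_share is my own hypothetical label, and the counts are the ones reported in this review.

```python
def open_access_share(free_count: int, total_count: int) -> int:
    """Return the free share of a collection as a whole-number percentage."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return round(100 * free_count / total_count)


# Counts quoted in the review for JBJS.
american_edition = open_access_share(17_729, 23_675)
both_editions = open_access_share(26_593, 35_000)

print(f"American edition: {american_edition}% free")
print(f"British and American editions: {both_editions}% free")
```

The same function could be pointed at the collection-wide figures reported by HighWire Press, should one want the free share of the whole service.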
It is not surprising that this is sophisticated software, as these guys have created the software platform for the Oxford English Dictionary (which is quite a daunting task), so dealing with the idiosyncrasies of the numerical-chronological designations and the title-subtitle-parallel title combination variations may have been a walk in the park for them.
The interface is self-explanatory on both the Quick Search and Advanced Search templates. The help file is comprehensive (except for not discussing at all my favorite "cited by" function), clear, and uses good illustrative examples and explanations, such as for stemming the search terms behind the scenes, but I think that the promises about author affiliation and citation searching by keywords are too optimistic in practice.
The output features also are good. I like the quick flip-flopping between the standard and the condensed formats of the result list and the highlighting of the matching words. I would like to see more than the two sort options (relevance and reverse chronological order), such as by journal name, author and most importantly sorting by absolute citedness and relative, per-year citedness.
Listing the 50 most-read and most-cited articles of a journal is an excellent feature. It is clearly explained that the most-read list is calculated from the full-text views (in pdf or html format). It would be better described as most viewed: I know from my own practice that I often start reading a paper that I displayed, but find out quickly that it is too theoretical for my taste, or is just beyond me, and don't read it further.
Similarly, the most-cited list is ranked by the number of times the article was cited in the most recent month in journals hosted by HighWire Press. Both lists are updated monthly, but not necessarily at the turn of the month.
The variety of follow-up services offered by HighWire Press on a particular article is outstanding, as shown on this sample. One may ask to be alerted when the article is cited or corrected, to e-mail the article to someone else (not necessarily a friend as the option says), to find similar articles in the same journal, or in PubMed, or WoS (for WoS subscribers), to add it to marked and saved citations (it means saved records here), to download to citation manager (a bibliography management software) and to engage in a variety of social bookmarking activities under the pretense of an article to the user's heart's content.
In some journals one or more citedness count options are displayed in the sidebar when the abstract or the record is displayed. I will cover this potentially precious feature in more detail below.
There are three additional features that I would like to see implemented/corrected in the software. The first is to have a check box on the search template to limit the search to records that do have an abstract. This is a lighter restriction than searching in the Title/Abstract field and offers a more informative result list, displaying part of the abstract or, if there is no formal abstract in the paper, the first 150 words or 20% of the text, whichever is shorter.
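The excerpt rule I have in mind for papers without a formal abstract can be sketched in a few lines of Python. This is only my own illustration of the "150 words or 20% of the text, whichever is shorter" idea, not a description of how HighWire Press builds its result lists; the function name and parameters are hypothetical, and for simplicity the sketch returns an existing abstract unchanged rather than trimming it.

```python
from typing import Optional


def result_list_excerpt(abstract: Optional[str], full_text: str,
                        max_words: int = 150, max_percent: int = 20) -> str:
    """Pick the text to show in a result list for one record."""
    if abstract:
        # A formal abstract exists; the sketch shows it as is.
        return abstract
    words = full_text.split()
    # Whichever limit is shorter: a fixed word cap or a share of the text.
    limit = min(max_words, len(words) * max_percent // 100)
    return " ".join(words[:limit])


short_paper = " ".join(f"w{i}" for i in range(100))    # 100 words
long_paper = " ".join(f"w{i}" for i in range(2000))    # 2,000 words

print(len(result_list_excerpt(None, short_paper).split()))  # 20% rule applies
print(len(result_list_excerpt(None, long_paper).split()))   # 150-word cap applies
```

Using whole-number percentages keeps the cut-off free of floating-point surprises for short texts.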
The second is an option to type in the journal name directly, instead of scrolling down the very long list of journal names to choose and mark the journal. (The list is a good feature when the user can't remember the name of the journal exactly or wants to search several journals simultaneously, but in the case of simple and unique journal names, such as Acta Sociologica, African Affairs, Angiology, Autism, Bioinformatics, Biometrika, Interfaces, etc., it would be more convenient just to type the journal name into a query cell directly.)
Then again, it may alleviate this problem that users can create a list of their favorite journals, or of the ones to which they or their library subscribe, from the HighWire Press collection. For example, if I were most interested in finding articles in HighWire Press from the Journal of Information Science, Journal of the American Medical Informatics Association (which appears twice on the list, once, I believe, erroneously as a soon forthcoming title of the British Medical Journal Group), Journal of Librarianship and Information Science, IFLA Journal, Information Development, Information Systems Research, International Journal of Law and Information Technology and Health Informatics Journal, I can create and/or update a My Favorite Journals list and choose the option to limit the search to only those journals.
I am not advocating this as highly relevant articles on my major research topic also are published in other journals, but it may simplify the search when you search about circulation control (in the library, not in the emergency room), or citation analysis (not at the police station but in the ivory towers of academia). This can be part of the broader personalization options offered under the My HighWire Press tab.
The third feature is the most important from my perspective as it relates to the citedness information of articles. With 6.3 million records for full-text scholarly papers this is really a feasible option to correct and enhance. There is a feature of HighWire Press that shows which articles in HighWire Press cited a paper, plus (depending on the journal and the article) an option to find out at the click of a button how many times it was cited by journals in CrossRef, Web of Science (WoS) or Scopus, sometimes in not just one but two of the databases. (Yes, this feature also is available for Google Scholar along with author search, but I grew very concerned about the misinformation dispensed by Google Scholar, so I can't get excited about its inclusion.)
This feature is best implemented for WoS because the number of times an article was cited in WoS is displayed automatically and directly in the sidebar. Among Blaise Cronin's articles above, for his paper "Bibliometrics and beyond" HighWire Press shows that it was cited 73 times in WoS and provides a link to show (for WoS subscribers) the citing articles. This is a great example of making good use of the synergy between HighWire Press and WoS.
Strangely, there is no count displayed for the citations from papers in journals hosted by HighWire Press, just a link to the part of the very same page which usually is right under your nose. In this case there are 12 articles citing Cronin's paper so they are easy to count, but this count should appear in the sidebar, as these are citations given to the article from journals hosted by HighWire Press. There is no citation count next to the Scopus link either. It should work exactly as WoS does. I kept getting the "record not found" message when I clicked on the Scopus link, which is quite a logical link, http://jis.sagepub.com/cgi/scopus/27/1/1, with the URL of the sending journal site (JIS for Journal of Information Science), the recipient service name (Scopus) and the volume, issue and page number of the target article.
The message returned is this. This is wrong and frustrating because when I enter the query the record is found — indicating that the article was cited 84 times in journals covered by Scopus. I got this reply for dozens of the sample articles, so there must be some technical problem in interpreting the query string in the link. This is bad for the goose and bad for the gander and for the author and for the journal and must be corrected because there are more and more journals that are supposed to get citedness data only from Scopus but not WoS and this gaffe undermines the credibility of the excellent concept and the quality of both services.
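The link pattern described above can be taken apart mechanically, which is why a failure to resolve it is so puzzling. The following is a minimal sketch in Python, not HighWire Press or Scopus code: the function name parse_citation_link and the dictionary keys are my own hypothetical labels, and the assumption that such links always have the form /cgi/service/volume/issue/page rests only on the single example quoted in this review.

```python
from urllib.parse import urlsplit


def parse_citation_link(url: str) -> dict:
    """Split a citation link into journal site, service and article locator."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: cgi / service / volume / issue / first page
    if len(segments) != 5 or segments[0] != "cgi":
        raise ValueError(f"unexpected link layout: {url}")
    _, service, volume, issue, first_page = segments
    return {
        "journal_site": parts.netloc,   # the sending journal site
        "service": service,             # the recipient service name
        "volume": int(volume),
        "issue": int(issue),
        "first_page": int(first_page),
    }


print(parse_citation_link("http://jis.sagepub.com/cgi/scopus/27/1/1"))
```

All the elements needed to identify the target article are present in the link, so the problem is more likely in how the query is interpreted at the receiving end than in the link itself.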
Some of the journals hosted by HighWire Press, such as Pediatrics, the American Journal of Neuroradiology, Blood, Journal of Cell Biology and Journal of Experimental Medicine, offer a link to citations from CrossRef participants. This sample record from Pediatrics, for example, offers links to learn the citedness of the article in, among others, journals which participate in the CrossRef project. CrossRef is the clearinghouse for the very successful DOI initiative, which uniquely identifies journal articles by a Digital Object Identifier. HighWire Press should show the citation counts right in the sidebar table, but it never does. In this case it is difficult to count the citing articles as there are so many, and the process interrupts the train of thought and the search process. The paper was cited by 47 articles in journals hosted by HighWire Press and by 48 articles in journals participating in the CrossRef project. Users would not count these citations.
If the counts were automatically and directly reported next to the links, as is the case with citation counts from journals covered by WoS, it would implicitly demonstrate how good the mix of journals in HighWire Press is, as it came up with a nearly identical citation count from a much smaller journal group than CrossRef has, and so it would increase the clout of the digital facilitator.
In many cases, the citation count would be the best and most intelligent silent advertisement for HighWire Press. In addition, HighWire Press has a visually much more pleasing list of the citing articles, indicating if any of the citing articles are available in an open-access format. (None of them were open-access in this case, but this is rather the exception than the rule).
The similarly structured but more redundant links like this http://jis.sagepub.com/cgi/external_ref?access_num=http://jis.sagepub.com/cgi/content/abstract/27/1/1&link_type=GOOGLESCHOLAR worked for Google Scholar in this case (but not in some other test examples), reporting 133 citations received by Cronin's article. It is another question that one may wonder how the purportedly 1990 book manuscript of Borgman and Furner could cite Cronin's 2001 paper.
This is a surprise only for casual users who don't know that for Google Scholar anything that has four digits (such as a phone number, fax number, street address, page number) may look like a publication year, even if the top of the page of a Borgman and Furner manuscript clearly indicates that this is a manuscript from 2002.
Rational Google Scholar searchers know that this is the least problematic feature of the grossly undereducated and unintelligent crawlers and parsers of Google Scholar from a bibliometric/scientometric perspective. Both its bibliographic metadata elements and citation counts are often utterly absurd as you can see from some illustrative excerpts of my lecture tour in June, 2009 for university librarians in Australia and New Zealand.
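The four-digit confusion described above is easy to reproduce with a deliberately naive rule. The sketch below is my own illustration in Python and makes no claim about how Google Scholar's parsers actually work; the sample line is made up, and both function names are hypothetical. It contrasts "take the first four-digit number" with a rule that only accepts a year next to a cue word.

```python
import re
from typing import Optional


def naive_year(text: str) -> Optional[int]:
    """Treat the first standalone four-digit number as the publication year."""
    match = re.search(r"\b\d{4}\b", text)
    return int(match.group()) if match else None


def cued_year(text: str) -> Optional[int]:
    """Accept a year only when it closely follows a cue word."""
    match = re.search(
        r"\b(?:manuscript|published|copyright)\D{0,20}((?:19|20)\d{2})\b",
        text,
        flags=re.IGNORECASE,
    )
    return int(match.group(1)) if match else None


# Made-up header line: a phone number fragment precedes the real date.
sample = "Tel. 555 1990. Manuscript, 2002."

print(naive_year(sample))  # picks up the phone number fragment
print(cued_year(sample))   # picks up the year after the cue word
```

The cue-word rule is still crude, but it shows how little context is needed to avoid mistaking a phone number for a publication year.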
This is an excellent service both for publishers and end-users (subscribers and non-subscribers alike). Apparently, it has so many good ideas (often copied by its competitors) that it does not have enough manpower to implement and verify all of them systematically as illustrated by one of the most attractive, but often malfunctioning, citation count reporting features. The question is not who is at fault for the non-working and potentially very valuable links to Scopus. Both parties should work on fixing and monitoring it, since it represents the future of smartly assisted searching for information. HighWire Press has paved the way for those who don't want to have the trouble, but would enjoy the benefits.