Search for scientific publications on the Internet. Part 2. Where and how to search

Continuation (the beginning is in Part 1)



1.3. Search engines: specialized and general-purpose



In general, search results depend primarily on the task and on how well the query is formulated. But most often these results are, on the one hand, redundant and, on the other hand, incomplete.



Fortunately, both authors and publishers are, as a rule, interested in having information about their publications indexed by search engines. There are some nuances, though: indexing the content of PDF files is not always allowed, and in some cases only certain search engines are allowed to index a site (for example, the largest Russian electronic library, elibrary.ru, at one time blocked Google from indexing most of its files).



Among other things, the query results depend on the word order and the IP address from which the search is performed.



As for searching for publications, the question of which search engine to use has one answer: Google (not counting specialized bibliographic search engines, which are discussed below).



First, Google indexes the content of the web quite fully. Second, its many advanced search settings (including search operators) make the work much easier. Third, as I already mentioned, Google indexes the content of PDF files even when the PDF consists of page images and has no text layer.







Pander, C. H. (1830). Beiträge zur Geognosie des Russischen Reiches. St. Petersburg, Karl Kray. 150 S.







Google advanced search settings. On Yandex, unfortunately, most of the advanced search settings that used to be available disappeared long ago; only minor things remain, such as searching by file type (where Yandex uses the mime: operator instead of Google's filetype:).



For finding publications, the most useful advanced settings and operators are those that restrict the search to files of a certain format (for example, PDF, using filetype:pdf) or to certain sites or domains. For example, if I need to see which Chinese sites host publications in PDF format that mention ammonites, this query will help: ammonites filetype:pdf site:cn. Note that there is no space between an operator and its value. A leading "-" marks an undesirable term, and a term that must appear can be put in quotation marks (Google no longer supports the old "+" operator for this). For example, when searching for information on ammonites, the cephalopods, you usually do not need results about the explosive of the same name or about the people who once lived in the Middle East and are regularly mentioned in the Bible. Accordingly, the query can be refined as follows: ammonites filetype:pdf -explosives -Bible

If you are looking for a specific publication, it is best to put part of its title, or the entire title, in quotation marks.
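When the same kind of query has to be repeated for many topics, the operators described above can be assembled programmatically. The following is only a minimal sketch: the function name build_query and its parameters are made up for illustration, and it produces nothing more than a query string to paste into the search box.

```python
def build_query(terms, exact_phrases=(), exclude=(), filetype=None, site=None):
    """Assemble a search query string from plain terms and common operators."""
    parts = list(terms)
    # Quoted phrases are matched exactly, which suits publication titles.
    parts += [f'"{phrase}"' for phrase in exact_phrases]
    # Operators are written without a space between the operator and its value.
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    # A leading minus excludes a term from the results.
    parts += [f"-{term}" for term in exclude]
    return " ".join(parts)


# The two example queries from the text:
print(build_query(["ammonites"], filetype="pdf", site="cn"))
print(build_query(["ammonites"], filetype="pdf", exclude=["explosives", "Bible"]))
```

Keeping the excluded terms in a list makes it easy to grow a personal "noise" list for a topic and reuse it across queries.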



It is also important that Google has two separate projects that are directly related to the search for publications:



1) Google Books is effectively a separate search engine that indexes the contents of a huge number of books, journals, collections and other publications. A significant share of these publications can be downloaded as PDF (as a rule, these are old editions, from the beginning of the 20th century and earlier). The list of publications available for download can vary significantly depending on the IP address; the largest number of works is available to users from the United States.



Quite a lot of publications are available for viewing in whole or in part. Such works can be downloaded using special programs such as the EDS Google Book downloader, or browser add-ons (such as Greasemonkey for Mozilla Firefox) combined with a download manager such as Download Master.



And finally, considerable benefit can be had even from publications that cannot be viewed in any form other than fragments a few lines long (snippet view). However, such publications pose two main difficulties:



a) You can, of course, try to look for such works elsewhere, but it is quite likely that they are available only in a library.



b) There is a lot of confusion in source titles (especially those not originally written in the Latin alphabet), and the information displayed is usually incomplete.



Nevertheless, the information in such fragments can be very important and practically impossible to find any other way.







This is what a typical Google Books result looks like in snippet view: as a rule, part of the necessary bibliographic information is missing (the issue number for a journal, sometimes important parts of the publication's title). It is fine if the journal has 2 issues a year. But what if it has 20? And what if the title is misspelled?



2) Google Scholar (Google Academy in Russian). This is a bibliographic search engine that finds both the articles themselves and references to them, and lets you immediately copy a reference formatted in a popular citation style (APA, Harvard, GOST, etc.). One of its conveniences is that it indexes not only publishers' sites but also specialized social networks and a variety of sites where scientific papers are often posted for free, and all links to full-text versions are grouped into a single cluster. However, Google Scholar does not index all publications. This is easy to check by running the identical query "keywords" filetype:pdf in Google and in Google Scholar. The difference is especially pronounced with rare keywords.



The most useful feature of Google Scholar, though, is the ability to subscribe to a variety of notifications (more on this in the continuation of this post).







Keyword search results in Google Scholar. Note the sorting options, the time range options, and the article clusters.



Bibliographic search engines, which are built for working with publications, are now numerous and very diverse. In addition to the Google projects above, the following sites can be counted among them:



1) Sites that index a huge number of publications from around the world. First of all, these are Scopus and Web of Science, available by subscription (in the case of Scopus, access is also given to reviewers for Elsevier's journals), as well as CrossRef, the largest registrar of DOIs for publications, and Dimensions, an aggregator of information about publications, grants, researchers, etc.



All of them except Dimensions search over a limited set of data, mainly the title, keywords and abstract. CrossRef stands out for the worse here: it searches by title only and matches word forms strictly. True, CrossRef indexes significantly more Russian-language publications than the other search engines in this group, and it is also the most convenient way to solve a problem like "I have the title of a publication and need to find its DOI". Not every DOI can be found this way, since CrossRef is not the only registrar of digital identifiers for publications (there is also DataCite, for example), but, oddly enough, there is simply no universal service for this task.
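The "title to DOI" task can also be scripted. The sketch below assumes CrossRef's public REST API (the /works endpoint with the query.bibliographic parameter) and the usual shape of its JSON responses; the function names are invented for this illustration, and the candidates it returns still have to be checked by eye, since the closest match is not necessarily the right publication.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.crossref.org/works"


def build_lookup_url(title, rows=5):
    """Return a CrossRef REST API URL that searches bibliographic metadata."""
    params = {"query.bibliographic": title, "rows": rows}
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def extract_candidates(response):
    """Pull (DOI, title) pairs out of a decoded CrossRef response."""
    items = response.get("message", {}).get("items", [])
    # 'title' is a list in CrossRef metadata and may be missing or empty.
    return [
        (item["DOI"], (item.get("title") or [""])[0])
        for item in items
        if "DOI" in item
    ]


def find_doi_candidates(title, rows=5):
    """Query CrossRef over the network and return candidate (DOI, title) pairs."""
    url = build_lookup_url(title, rows)
    with urllib.request.urlopen(url, timeout=30) as reply:
        return extract_candidates(json.load(reply))
```

Calling find_doi_candidates with a publication title returns up to rows candidates; an empty list may simply mean that the DOI was registered elsewhere, for example with DataCite.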





Simple search in Dimensions



Dimensions is a very interesting, recently launched project, notable above all for its variety of settings, its wide coverage (only publications with a DOI are indexed, and there are still slightly fewer of them than in CrossRef) and its full-text search. More precisely, you can choose among search modes (full text / abstract / title and keywords). The results can be sorted in a variety of ways (date / relevance / number of citations / Altmetric score) and filtered by various parameters (source / author / year / subject and much more). Dimensions comes in several versions (including paid and corporate ones); only the free version is considered here (we have not yet worked with the others). You can search separately across publications, datasets and grants (the last of these is available only by subscription).







Analytical view (2016 to 2020)



Additional options are offered in the Analytical view tab. They make it easy to see who is working on a particular topic, now or in any selected time range, which journals these people publish in, and with which co-authors. This is a convenient way to find potential co-authors and reviewers, especially for those who have just started working on a topic and do not yet have a good idea of what is being done on it worldwide. For researchers whose articles carry an ORCID, the profile shows this identifier and the Scopus author ID, as well as (if available) the ResearcherID / Publons profile, all linked automatically. I repeat: Dimensions is an extremely useful project, and an intuitive one. You can simply click every button and open every tab in turn.



2) The sites of the largest international publishers (Elsevier, Wiley, Springer, Taylor & Francis, etc.) and distributors (Ingentaconnect, GeoscienceWorld) of scientific publications can also be regarded as specialized bibliographic search engines. However, restricting search results to a single publisher or distributor is generally not beneficial; it is mostly useful for a quick introduction to a particular topic.



3) To some extent, the functions of a bibliographic search engine are performed by scientific social networks (Academia.edu, ResearchGate), as well as by Mendeley, a "hybrid" of a social network and a reference manager (available both as an offline program and as an online version; since Elsevier purchased Mendeley, many Scopus options are available there). However, the content of scientific social networks is well indexed by Google, so what makes sense there is mainly to browse the update feed regularly in search of something completely new.



4) Regional or specialized sites form a separate category of bibliographic search engines. These are sites that mainly hold data on publications issued in one country or a few countries (for example, the Scientific Electronic Library elibrary.ru in Russia, the National Institute of Informatics in Japan, the National Library of France), as well as specialized sites dedicated to specific scientific fields (for example, the Biodiversity Heritage Library (BHL)).



A characteristic feature of such portals is that they are extremely reluctant to allow third-party search engines to index their content, so if you need to find something French or Japanese, it is more reliable to look at the relevant sites and search there.







Until recently, the entire interface of the National Library of France website was in French; eventually they added first an English version of the site and then automatic language selection by IP address.



BHL deserves a separate mention. It is an extremely useful project for all researchers who study living or fossil organisms in one way or another. The library is distinguished by a wide range of sources (including various rarities) and by special search tools (such as the taxon search in the Advanced search tab: if you are collecting materials on a particular group of animals or plants, this is a very good way to quickly find publications on the topic). Among BHL's shortcomings are a text layer that is often recognized incorrectly (with the wrong language) and the dreadful quality of the default illustrations (comparable to a bad, blurry .djvu).



Since image quality usually matters a great deal for taxonomic studies, the best approach here is to download the required publication in jp2 format and then process the files (first converting them to ordinary jpg / tiff, then running them through ScanTailor and OCR). By the way, all publications from BHL are also posted on archive.org, and sometimes it is more convenient to run a full-text search directly on archive.org. This can be relevant when searching for rarities: something interesting may turn up there, including items uploaded by users.
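The first processing step, converting the jp2 pages into an ordinary format, is easy to automate. The sketch below is one possible way to do it, not the only one. It assumes that the JPEG 2000 download has already been unpacked into a folder of .jp2 page images and that the third-party Pillow library is installed with JPEG 2000 (OpenJPEG) support; the function names are invented for this example.

```python
from pathlib import Path


def plan_conversions(src_dir, dst_dir, extension=".tiff"):
    """Pair every .jp2 page in src_dir with an output path in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Sorting keeps the pages in file-name order.
    pages = sorted(src_dir.glob("*.jp2"))
    return [(page, dst_dir / (page.stem + extension)) for page in pages]


def convert_pages(src_dir, dst_dir, extension=".tiff"):
    """Convert the planned pages; needs Pillow with JPEG 2000 support."""
    # Third-party import, done lazily so that planning works without Pillow.
    from PIL import Image

    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for source, target in plan_conversions(src_dir, dst_dir, extension):
        with Image.open(source) as image:
            # Pillow picks the output format from the file extension.
            image.save(target)
```

TIFF is used as the default here so that no further quality is lost before the pages go to ScanTailor; passing extension=".jpg" gives smaller files at the cost of lossy compression.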







An example of output when searching by taxon on BHL







If you need a high-quality PDF, it is better to save the file via "Download Content - Download book - Download JPEG 2000" and then process it.



And, of course, if you need to find Russian-language publications, you cannot do without searching elibrary in combination with CyberLeninka. Although elibrary covers far more sources, we regularly run into cases where elibrary asks you to pay for an article while the same article is freely available on the CyberLeninka website.



Despite a number of shortcomings that elibrary seems to have had since birth (open-access works cannot be downloaded without logging in; there is no English version and no option to subscribe to updates), its search is quite decent. But if you need to track Russian-language journals regularly, it is also worth keeping a separate list of links to the sites of the journals you need: on elibrary you can never guess when and why access to certain publications may suddenly be closed. One more thing: when a journal is not open access and is sold both through elibrary and through the publisher's website, articles may be cheaper on the publisher's website (this is the case, for example, with the journal "Oil Industry").







Advanced search settings on elibrary (the "advanced search" link is at the top left of the site's home page). The history of previous search queries is also kept here.



5) The largest "pirate" projects that provide free access to scientific publications, Sci-Hub and LibGen, can also be regarded as bibliographic search engines, since they allow searching by publication title or by keywords in one form or another.

And while Sci-Hub is best used as a convenient complement to searching in Dimensions, LibGen regularly receives rare monographs that cannot be found anywhere else: enthusiasts scan them and upload them to LibGen on their own initiative.



Finally, searching for dissertations deserves a separate mention. Although many dissertations (both modern Russian ones and sometimes quite old foreign ones) are freely available on the Internet and indexed by search engines, it makes sense to check the VAK website for information about the latest dissertations, including those whose defense is only scheduled. Dissertations there can now be searched by specialty, keywords, defense date and other parameters (the search is run separately for VAK dissertations and for those defended at the councils of organizations that have the right to award degrees independently). There is one catch: if you have uBlock Origin installed, it blocks search on this site.







An example of a search on the VAK website



To be continued.


