Authors
- Yuji Fujita
- Noritaka Usami
- Toshiaki Fujii
- Hiroaki Nagai
Abstract
Precision and recall are useful indices for evaluating an operation, algorithm, system, or other subject from two different facets. However, they are not readily available while the subject is still in progress, because the truth set, which is required to calculate recall, is unknown. In this study, we present a method to predict the size of the truth set of an inquiry still in progress. The method combines classical 18th-century mechanics, found and formulated by Isaac Newton and known today as “Newton’s cooling law”, with a set-theoretical trick, and is executed by Markov Chain Monte Carlo. We apply the method to nation-wide collections of identifications of article authors with the affiliation data of Japanese national research organizations, and obtain recall values as part of the Japanese government's objective, evidence-based policy for science and technology. The author identification result is naturally represented as a directed bipartite graph from the set of authors to the set of affiliation data. We conduct a kind of network prediction, not on the bipartite graph itself but on the size of its vertex sets, and obtain the true graph size by using a simple and straightforward probabilistic model, which is implemented with a probabilistic inference method that is also classical yet still developing.
Author summary: In this study, we propose a method to predict the recall of unfinished work, for example a survey or inquiry still in progress. It may sound strange to discuss recall or precision for a subject that is still under way, yet they are needed in real-world applications because we want to know how much more we can expect from the unfinished inquiry. The trick is to predict the size of the truth set, not the set itself. If we observe an identical subject multiple times, the observations will differ slightly from each other, depending on how much is left to be done. If the results are almost equal, the observations are good approximations of the truth set, and the recall should be high. Conversely, disagreement between the observations is a sign of low recall. The method consists of simple, classic early-eighteenth-century mechanics found and formulated by Sir Isaac Newton and known today as “Newton’s cooling law”, some mathematical tricks for data preparation, and Markov Chain Monte Carlo. We apply the method to nation-wide collections of article author identifications with the affiliation data of Japanese national research organizations, and predict how many researchers are left to be identified as authors. As the identification relation from an author to the affiliation data of a researcher is naturally represented as a directed bipartite graph, a form of network prediction on a directed bipartite graph is executed in this study.
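To make the central idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes (the abstract does not state the exact formulation) that the number of items found so far relaxes toward the unknown truth set size in the same way a temperature relaxes toward its surroundings under Newton's cooling law, that is, along an exponential curve. The function names, the sample counts, and the closed-form extrapolation from three equally spaced observations are all illustrative assumptions; the paper itself infers the size by Markov Chain Monte Carlo after a set-theoretical data preparation, neither of which is reproduced here.

```python
import math


def cooling_curve(t, n_inf, n0, k):
    """Assumed cooling-law relaxation: the found count approaches n_inf over time."""
    return n_inf - (n_inf - n0) * math.exp(-k * t)


def estimate_truth_size(n0, n1, n2):
    """Extrapolate the limit n_inf from three equally spaced counts.

    For an exact exponential approach, the gap to the limit shrinks by a
    constant ratio per step, which yields this closed form. It stands in
    for the MCMC inference described in the abstract.
    """
    denom = n0 + n2 - 2.0 * n1
    if denom == 0:
        raise ValueError("counts are collinear; no exponential limit can be inferred")
    return (n0 * n2 - n1 * n1) / denom


def recall(found, truth_size):
    """Recall as the fraction of the (estimated) truth set found so far."""
    return found / truth_size


# Hypothetical, noise-free counts generated from the assumed curve itself.
true_n_inf, start, rate = 1000.0, 200.0, 0.5
counts = [cooling_curve(t, true_n_inf, start, rate) for t in (0, 1, 2)]

estimated = estimate_truth_size(*counts)
current_recall = recall(counts[-1], estimated)
```

With noise-free counts the extrapolation recovers the assumed limit, and the recall follows directly from the latest count. Real observations are noisy, which is one reason a probabilistic treatment such as the MCMC approach mentioned in the abstract would be preferable to this closed form.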
Suggested Citation
Yuji Fujita & Noritaka Usami & Toshiaki Fujii & Hiroaki Nagai, 2024.
"Truth set size prediction by Newton’s cooling law,"
PLOS Complex Systems, Public Library of Science, vol. 1(4), pages 1-16, December.
Handle:
RePEc:plo:pcsy00:0000020
DOI: 10.1371/journal.pcsy.0000020