Paper: |
Improving astroBERT Using Semantic Textual Similarity |
Volume: |
538, ADASS XXXII |
Page: |
305 |
Authors: |
Felix Grezes; Thomas Allen; Sergi Blanco-Cuaresma; Alberto Accomazzi; Michael J. Kurtz; Golnaz Shapurian; Edwin Henneken; Carolyn S. Grant; Donna M. Thompson; Timothy W. Hostetler; Matthew R. Templeton; Kelly E. Lockhart; Shinyi Chen; Jennifer Koch; Taylor Jacovich; Pavlos Protopapas |
DOI: |
10.26624/TGMX1534 |
Abstract: |
The NASA Astrophysics Data System (ADS) is an essential tool that allows researchers to explore the astronomy and astrophysics scientific literature, but it has yet to exploit recent advances in natural language processing. At ADASS 2021, we introduced astroBERT, a machine learning language model tailored to the text used in astronomy papers in ADS. In this work we announce the first public release of the astroBERT language model, show how astroBERT improves over existing public language models on astrophysics-specific tasks, and detail how ADS plans to harness the unique structure of scientific papers, namely the citation graph and citation context, to further improve astroBERT. |