I've found Longformers on Hugging Face that come paired with a tokenizer and pretrained weights and look usable, but for most of the long-range transformers that still seems aspirational.
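For reference, here is a minimal sketch of loading one of those checkpoints together with its matching tokenizer; it assumes the allenai/longformer-base-4096 checkpoint and the transformers library, not any particular model I had in mind:

    # Minimal sketch: load a Longformer that ships with a matching
    # tokenizer and pretrained weights (assumes transformers + torch).
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = AutoModel.from_pretrained("allenai/longformer-base-4096")

    text = "a long document " * 500  # well past the usual 512-token BERT limit
    inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")
    outputs = model(**inputs)        # last_hidden_state: (1, seq_len, 768)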
If you want to understand how to evaluate search engines, the best resource is this book:
https://www.amazon.com/TREC-Experiment-Evaluation-Informatio...
Basically you make a set of queries, then you make a list of judgements to the effect that "document A (is|is not) relevant to query B"; the code in that GitHub repository is what conference-goers use to merge those judgements with search results into a precision-recall curve. You can download document collections and judgements from the TREC website to get started, but what works for their collections may or may not work well for yours.
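To make that concrete, here is a hand-rolled sketch of the bookkeeping that kind of tool does; the qrels and run data below are made up for illustration:

    # Merge relevance judgements with a ranked result list and compute
    # (precision, recall) at each rank cutoff.

    # qrels: (query_id, doc_id) -> 1 if judged relevant, 0 if judged not relevant
    qrels = {("q1", "d1"): 1, ("q1", "d2"): 0, ("q1", "d3"): 1, ("q1", "d4"): 1}

    # run: your engine's ranked results for each query
    run = {"q1": ["d2", "d1", "d5", "d3"]}

    def pr_curve(query_id, ranked_docs, qrels):
        total_relevant = sum(rel for (q, _), rel in qrels.items() if q == query_id)
        points, hits = [], 0
        for rank, doc in enumerate(ranked_docs, start=1):
            hits += qrels.get((query_id, doc), 0)  # unjudged docs count as not relevant
            points.append((hits / rank, hits / total_relevant))  # (precision, recall)
        return points

    print(pr_curve("q1", run["q1"], qrels))
    # [(0.0, 0.0), (0.5, 0.33), (0.33, 0.33), (0.5, 0.67)]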
The story of that TREC conference was that "99% of what you think will improve your search results won't": the BM25 algorithm was the first great discovery to come out of it, after five years of disappointment. I learned to be skeptical of any kind of "break a document up into subdocuments and rank the subdocuments" scheme, because that was one of the many ideas people struggled to make work back in the day.
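For reference, here is a rough sketch of BM25 scoring on a toy corpus; the corpus, the k1/b defaults, and the Lucene-style smoothed IDF are my own illustrative choices, not anything specific to TREC:

    # BM25 scoring of one query against a tiny in-memory corpus.
    import math

    docs = {
        "d1": "the quick brown fox".split(),
        "d2": "the lazy dog sleeps all day".split(),
        "d3": "the quick dog".split(),
    }
    N = len(docs)
    avgdl = sum(len(d) for d in docs.values()) / N  # average document length

    def idf(term):
        n = sum(term in d for d in docs.values())   # docs containing the term
        return math.log((N - n + 0.5) / (n + 0.5) + 1)  # smoothed, always >= 0

    def bm25(query, doc, k1=1.2, b=0.75):
        score = 0.0
        for term in query:
            tf = doc.count(term)  # term frequency, saturated by k1 below
            score += idf(term) * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        return score

    for doc_id, doc in docs.items():
        print(doc_id, round(bm25("quick dog".split(), doc), 3))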
There definitely are systematic ways to look at documents a piece at a time for particular tasks, and sometimes simple answers work ("Who won the sports game?" is almost always answered at the beginning of a newspaper article). Most of the simple ways people try (like averaging vectors), though, are like having Superman huff some Kryptonite before you test him.
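To show what I mean about averaging, here is a toy sketch (made-up 3-d vectors standing in for real embeddings) where one on-topic chunk gets washed out by the mean but still surfaces under a max-over-chunks score:

    # One relevant passage buried in an off-topic document: the mean
    # vector dilutes it, the best-chunk score keeps it.
    import numpy as np

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    query = np.array([1.0, 0.0, 0.0])
    chunks = [np.array([0.0, 1.0, 0.0])] * 9 + [np.array([1.0, 0.1, 0.0])]

    avg_score = cos(query, np.mean(chunks, axis=0))        # ~0.11
    max_score = max(cos(query, c) for c in chunks)         # ~0.99
    print(f"averaged vector: {avg_score:.3f}, best chunk: {max_score:.3f}")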
Look up my profile and send me an email.