ResearchPilot turns any PDF into a structured analysis: domain classification across 7 fields and 42 sub-fields, an academic-register summary, ranked keywords, and ten recommendations sorted by cosine similarity.
Introduces the Transformer, an architecture relying solely on attention mechanisms, dispensing with recurrence and convolutions. Achieves state-of-the-art translation quality with a fraction of the training cost.
Every capability is independently inspectable: classifications, embeddings, and rankings are exposed as raw values, not just rendered pixels.
PyMuPDF reconstructs title, abstract, body and references from any uploaded PDF. No manual annotation, no template matching. Clean inputs ready for the linguistic models downstream.
7 domains. 42 sub-categories.
Gemini is constrained to a fixed schema: objective, methodology, principal findings.
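A minimal sketch of what enforcing that fixed schema could look like on the receiving side, assuming the model returns a JSON object. The key names (`objective`, `methodology`, `principal_findings`), the `Summary` class, and the `parse_summary` helper are illustrative assumptions, not ResearchPilot's actual code.

```python
import json
from dataclasses import dataclass

# Assumed key names for the three schema fields named above.
REQUIRED_FIELDS = ("objective", "methodology", "principal_findings")


@dataclass(frozen=True)
class Summary:
    objective: str
    methodology: str
    principal_findings: str


def parse_summary(raw: str) -> Summary:
    """Parse a model response and reject anything outside the fixed schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("summary must be a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    extra = sorted(set(data) - set(REQUIRED_FIELDS))
    if missing or extra:
        raise ValueError(f"schema mismatch: missing={missing}, extra={extra}")
    for field in REQUIRED_FIELDS:
        value = data[field]
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"field {field!r} must be a non-empty string")
    return Summary(**{f: data[f].strip() for f in REQUIRED_FIELDS})
```

Rejecting unknown keys as well as missing ones keeps the output shape predictable for whatever renders the summary downstream.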
20 candidates are retrieved from the live Semantic Scholar API, then re-ranked by the cosine similarity of their Sentence-BERT embeddings to surface the 10 works closest to the paper at hand.
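The re-ranking step can be sketched in plain Python. This is an illustration under stated assumptions: the embeddings are taken as already computed and passed in as lists of floats, and the `rerank` function and its `(title, vector)` candidate format are hypothetical names chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(
    query_vec: Sequence[float],
    candidates: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score each (title, embedding) pair against the query, highest first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With 20 retrieved candidates and `top_k=10`, the function returns the ten highest-scoring titles together with their raw similarity values, so the ranking stays inspectable.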
A compact, audit-friendly inference graph. Each node fails gracefully; uncertainty is surfaced rather than papered over.
Every dependency is open source. Every component is replaceable.
University of Bahrain · 2026
The repository is public. The demo is live. The pipeline is yours to inspect.