How We Rebuilt a Research SaaS Used by 200,000+ Academics

Paperpile was an existing research platform built on PHP that could not scale to match its growth trajectory. We redesigned the full UI system in Figma, rewrote the backend in Node.js, migrated the database to MongoDB, and rebuilt the frontend in React, all while maintaining 100% feature parity. The result: 200,000+ active users, 95% performance scores, and a platform 3× faster than the original.

Year: 2025
Duration: 18 months
Services: SaaS Development, Web Design
Stack: Figma, MongoDB, React, Node.js, HTML, JavaScript, CSS
Kerim Alihodza CEO & Business Mechanic · 2025
200k+ active researchers worldwide
95% Core Web Vitals performance score
3× faster than the original PHP version
5★ verified client rating

TL;DR

Paperpile had an existing PHP research platform with a real user base. The architecture could not scale to where they needed to go. We redesigned the full UI system in Figma, rewrote the backend from PHP to Node.js, migrated to MongoDB, and rebuilt the frontend in React, all while keeping the live product running. Eighteen months later: 200,000+ active users, 95% performance scores, 3× speed improvement, 100% feature parity maintained throughout.

The Situation

Paperpile was already a functional product with an active user base of researchers and academics. The platform handled PDF management, citation generation, collaborative editing, and browser extension integration, a technically complex set of operations built on PHP over several years.

The problem: the existing architecture could not support the user growth they were targeting. Adding features was getting slower and more expensive. Performance was degrading as the user base grew. A complete architectural rewrite was necessary, but it had to happen without breaking a product that real users depended on daily.

The Challenges We Solved

Challenge 1: Design a Component System That Works Across Web, Mobile, and Extensions

The PHP interface was functional but inconsistent. Different pages had different interaction patterns, the mobile experience was poor, and there was no shared design language between the web app, mobile app, and browser extension.

What we built in Figma:

  • A complete design system with shared component library (buttons, forms, cards, modals, navigation patterns)
  • Responsive design specifications for every component across breakpoints
  • Interaction states documented for every interactive element
  • Accessibility annotations for keyboard navigation and screen reader compatibility
  • Design documentation that the engineering team could implement without ambiguity

The Figma design system became the single source of truth for all visual decisions throughout the 18-month build, ensuring consistency across every platform.

Challenge 2: Architect a Backend That Can Scale to Hundreds of Thousands of Users

The PHP backend was a monolithic architecture that did not separate concerns cleanly. Adding a new feature often required touching multiple unrelated parts of the system. Scaling one component meant scaling everything.

Backend architecture we built (Node.js + MongoDB):

  • RESTful API design with clearly defined service boundaries
  • Microservices structure allowing independent scaling of high-traffic features (PDF processing, citation lookup, collaboration sync)
  • MongoDB schema designed for research document storage, flexible enough for varied citation formats and performant at the query patterns the product required
  • Multi-layer caching strategy reducing database load for frequently accessed data
  • Security architecture protecting research data with encryption at rest and in transit
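
A minimal TypeScript sketch of how a multi-layer, read-through cache of this kind can be structured. The case study does not say which cache layers were used, so everything here is an assumption for illustration: the `CacheLayer` interface, the `MemoryLayer` class, and the `readThrough` helper are hypothetical names, and in production the outer layer would more likely be a shared network cache than a second in-process map.

```typescript
// Hypothetical cache layer contract; a shared network cache could implement it too.
interface CacheLayer<V> {
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

// In-process layer with a fixed time-to-live per entry.
class MemoryLayer<V> implements CacheLayer<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  async get(key: string): Promise<V | undefined> {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= Date.now()) {
      this.store.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  async set(key: string, value: V): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Check layers from fastest to slowest; only hit the database when all miss.
async function readThrough<V>(
  key: string,
  layers: CacheLayer<V>[],
  loadFromDb: (key: string) => Promise<V>,
): Promise<V> {
  const missed: CacheLayer<V>[] = [];
  for (const layer of layers) {
    const hit = await layer.get(key);
    if (hit !== undefined) {
      // Backfill the faster layers that missed so the next read is cheaper.
      for (const m of missed) await m.set(key, hit);
      return hit;
    }
    missed.push(layer);
  }
  const value = await loadFromDb(key);
  for (const layer of layers) await layer.set(key, value);
  return value;
}
```

Calling `readThrough(id, [fastLayer, sharedLayer], loadDocument)` twice for the same key reaches the database once. The second call is served from whichever layer still holds the entry.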

Frontend architecture we built (React):

  • Component-based structure with shared component library matching the Figma design system
  • State management for complex real-time collaboration features
  • Client-side routing for smooth navigation without full page reloads
  • Code splitting to reduce initial bundle size
  • Progressive enhancement ensuring functionality across browser versions
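
To illustrate the loading pattern behind route-level code splitting, here is a small framework-free sketch. It assumes each route maps to a loader function. In a real bundler setup that loader would be a dynamic `import()` (in React, typically wrapped with `React.lazy`), and the bundler would emit a separate chunk for it. The `lazyOnce`, `pageRoutes`, and `openRoute` names and the page modules are hypothetical.

```typescript
type Loader<T> = () => Promise<T>;

// Wrap a loader so the underlying chunk is requested at most once,
// even if several navigations ask for it at the same time.
function lazyOnce<T>(loader: Loader<T>): Loader<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) {
      pending = loader().catch((err: unknown) => {
        pending = undefined; // allow a retry after a failed download
        throw err;
      });
    }
    return pending;
  };
}

interface PageModule {
  title: string;
}

// Stand-ins for dynamic imports such as () => import("./LibraryPage").
const pageRoutes: Record<string, Loader<PageModule>> = {
  "/library": lazyOnce(async () => ({ title: "Library" })),
  "/citations": lazyOnce(async () => ({ title: "Citations" })),
};

// Client-side navigation: load only the code for the requested page.
async function openRoute(path: string): Promise<string> {
  const load = pageRoutes[path];
  if (!load) return "Not found";
  const page = await load();
  return page.title;
}
```

Nothing for `/citations` is loaded until a user navigates there, which is what keeps the initial bundle small.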

Challenge 3: Migrate Years of Business Logic Without Losing Functionality

Years of PHP code contained product logic that was not fully documented. A direct rewrite without careful mapping risked introducing regressions that would break workflows researchers depended on.

Our migration approach:

  1. Feature audit: documented every function in the existing PHP codebase with expected input/output behavior
  2. Test suite built against the existing system to define “correct behavior” before touching any code
  3. Feature-by-feature rewrite in Node.js, validated against the test suite at each step
  4. Parallel running: new components operated alongside old ones until validated
  5. Staged cutover: users migrated in cohorts, with rollback capability at each stage
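
Steps 2 and 3 amount to characterization testing: record what the legacy system does, then require the rewrite to reproduce it. The sketch below shows one way to structure that check in TypeScript. It is an assumption-laden illustration, not the project's actual test suite: `RecordedCase`, `checkParity`, and the `formatAuthors` feature are hypothetical, and the recorded outputs are made up for the example.

```typescript
// One observation captured from the legacy system: an input and the output it produced.
interface RecordedCase<I, O> {
  label: string;
  input: I;
  expected: O;
}

interface ParityReport {
  passed: string[];
  failed: { label: string; expected: string; actual: string }[];
}

// Run the rewritten function against every recorded case and compare
// serialized outputs, so structural differences are caught too.
function checkParity<I, O>(
  cases: RecordedCase<I, O>[],
  rewritten: (input: I) => O,
): ParityReport {
  const report: ParityReport = { passed: [], failed: [] };
  for (const c of cases) {
    const expected = JSON.stringify(c.expected);
    let actual: string;
    try {
      actual = JSON.stringify(rewritten(c.input));
    } catch (err) {
      actual = "threw: " + String(err); // an exception counts as a regression
    }
    if (actual === expected) report.passed.push(c.label);
    else report.failed.push({ label: c.label, expected, actual });
  }
  return report;
}

// Hypothetical feature being migrated: formatting an author list for a citation.
function formatAuthors(authors: string[]): string {
  if (authors.length <= 2) return authors.join(" & ");
  return authors.slice(0, 1).join("") + " et al.";
}

// Hypothetical outputs recorded from the legacy implementation.
const recordedAuthorCases: RecordedCase<string[], string>[] = [
  { label: "no authors", input: [], expected: "" },
  { label: "single author", input: ["Smith"], expected: "Smith" },
  { label: "two authors", input: ["Smith", "Jones"], expected: "Smith & Jones" },
  { label: "three authors", input: ["Smith", "Jones", "Lee"], expected: "Smith et al." },
];

const authorParity = checkParity(recordedAuthorCases, formatAuthors);
```

A feature is ready for parallel running only when `authorParity.failed` is empty. Any entry in that list names the exact case where the rewrite diverges from the legacy behavior.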

The result: 100% feature parity on launch day. No user-facing regressions throughout the 18-month project. Zero forced downtime.

Challenge 4: Performance Engineering for a Data-Heavy Research Application

Paperpile users work with thousands of PDFs and complex citation databases. The PHP version was slow, with sub-par load times for large libraries and noticeable lag in collaborative editing sessions.

Backend performance work:

  • MongoDB indexing optimized for the specific query patterns of research document retrieval
  • Async processing for PDF analysis and citation extraction, so heavy operations run in the background without blocking the user interface
  • CDN integration for static assets, with files served from edge locations closest to the user
  • Load balancing configuration for horizontal scaling as user count grows
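
The async-processing point can be sketched as a small in-process queue with a concurrency limit. This is a simplified illustration under stated assumptions. The case study does not name the queueing technology, a production system would more likely use a durable external queue with separate workers, and `BackgroundQueue` is a hypothetical class.

```typescript
type Job<T> = () => Promise<T>;

// Runs at most `concurrency` jobs at once; the rest wait their turn.
class BackgroundQueue {
  private waiting: Array<() => void> = [];
  private running = 0;
  private concurrency: number;

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  // Returns immediately with a promise, so the caller (for example an
  // HTTP handler) is not blocked while the heavy work is pending.
  enqueue<T>(job: Job<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.running++;
        Promise.resolve()
          .then(job)
          .then(
            (value) => {
              this.release();
              resolve(value);
            },
            (error) => {
              this.release();
              reject(error);
            },
          );
      };
      if (this.running < this.concurrency) start();
      else this.waiting.push(start);
    });
  }

  // Free a slot and start the next waiting job, if any.
  private release(): void {
    this.running--;
    const next = this.waiting.shift();
    if (next) next();
  }
}
```

An upload handler could call `queue.enqueue(() => extractCitations(pdf))`, where `extractCitations` is a hypothetical worker function, and respond to the user right away while extraction continues in the background.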

Frontend performance work:

  • Code splitting: JavaScript bundles load only what each page needs
  • Lazy loading for images, PDFs, and off-screen components
  • Bundle optimization reducing total JavaScript payload by 60%
  • Core Web Vitals monitoring integrated into the deployment pipeline
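
One way to wire Core Web Vitals into a deployment pipeline is a budget check that fails the build when field measurements regress. The sketch below assumes Google's published "good" thresholds (LCP at or under 2.5 s, INP at or under 200 ms, CLS at or under 0.1, assessed at the 75th percentile). Whether the project used these exact budgets is an assumption, and `VitalsSample`, `p75`, and `checkBudget` are hypothetical names.

```typescript
interface VitalsSample {
  lcpMs: number; // Largest Contentful Paint, milliseconds
  inpMs: number; // Interaction to Next Paint, milliseconds
  cls: number;   // Cumulative Layout Shift, unitless
}

// Assumed budgets: Google's "good" thresholds for each metric.
const VITALS_BUDGET: VitalsSample = { lcpMs: 2500, inpMs: 200, cls: 0.1 };

// 75th percentile using the nearest-rank method.
function p75(values: number[]): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length);
  return sorted[rank - 1];
}

// Returns one message per metric whose 75th percentile exceeds its budget.
// An empty result means the deployment may proceed.
function checkBudget(
  samples: VitalsSample[],
  budget: VitalsSample = VITALS_BUDGET,
): string[] {
  const metrics: (keyof VitalsSample)[] = ["lcpMs", "inpMs", "cls"];
  const violations: string[] = [];
  for (const metric of metrics) {
    const observed = p75(samples.map((s) => s[metric]));
    if (observed > budget[metric]) {
      violations.push(`${metric}: p75 ${observed} exceeds budget ${budget[metric]}`);
    }
  }
  return violations;
}
```

A pipeline step would collect samples from real-user monitoring or a lab run, call `checkBudget`, and stop the release if the returned list is not empty.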

Outcome: performance scores consistently at 95%, sub-second search across millions of documents, collaborative editing without refresh lag, and equivalent operations running 3× faster than in the original PHP version.

Results

Metric            | PHP Version     | MERN Version              | Change
Active users      | Baseline        | 200,000+                  | Significant growth
Performance score | Below threshold | 95%                       | Industry-leading
Search speed      | Slow            | Sub-second                | 3× improvement
Feature additions | Slow, expensive | Fast, modular             | Architectural benefit
Platform support  | Web only        | Web + Mobile + Extensions | Expanded

What Made the Difference

Feature-by-feature migration, not a big-bang rewrite. The most common failure mode in platform rewrites is attempting to rewrite everything simultaneously and launch all at once. Every component that can be isolated and validated independently reduces the overall risk. We isolated 23 distinct feature areas and validated each one before proceeding.
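
A feature-by-feature cutover with user cohorts can be expressed as a routing decision per feature area. The following is a minimal sketch under stated assumptions. The feature-area names, the percentages, and the `bucketOf` and `backendFor` helpers are all hypothetical, and the hash is an FNV-1a variant applied to UTF-16 code units rather than bytes.

```typescript
// Deterministic 32-bit FNV-1a style hash mapped to a bucket from 0 to 99,
// so a given user always lands in the same cohort.
function bucketOf(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

type Backend = "legacy-php" | "node";

// Hypothetical rollout table: feature area -> percentage of users served by
// the new backend. Setting a value back to 0 is the rollback.
const rolloutTable: Record<string, number> = {
  citations: 100,
  "pdf-processing": 25,
  collaboration: 0,
};

function backendFor(
  featureArea: string,
  userId: string,
  table: Record<string, number> = rolloutTable,
): Backend {
  const percent = table[featureArea] ?? 0; // unknown areas stay on legacy
  return bucketOf(userId) < percent ? "node" : "legacy-php";
}
```

Because the bucket depends only on the user ID, raising a percentage only adds users to the new backend. Nobody who was already migrated is moved back unless the value is lowered.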

Performance built into the architecture, not bolted on afterward. Performance optimizations added after architecture decisions have limited effect. The ceiling is set by the architecture itself. Building caching, async processing, and database indexing into the design phase rather than the optimization phase meant there was no artificial ceiling to work around later.

Design system first. Building the Figma component library before writing any code meant that engineering decisions could reference a shared visual language from day one. Inconsistencies caught at the design stage are cheap to fix. The same inconsistencies discovered during engineering review are expensive.
