Component Design

A robust and scalable architecture is crucial for the project. A microservices-based architecture would be a suitable approach, as it aligns with the principles of the AWS Well-Architected Framework we’ve already considered.

  • Client Applications: These are the front-facing parts of your system. A key takeaway from the Dropbox design is to make clients smart.
    • Passenger Mobile App: Should handle offline modes and synchronize with the backend when connectivity is restored. It will communicate with the backend via the API Gateway.
    • IoT Devices (on buses): These clients should be able to buffer data locally in case of network interruption and send it in chunks to the ingestion service once reconnected. This “chunking” approach is inspired by the Dropbox design and optimizes data transfer.
    • Admin Dashboard: Web-based clients that interact with the system through the API Gateway.
  • API Gateway: This remains the single entry point for all client requests. As suggested in the URL Shortener design, it exposes public-facing REST APIs to the clients; internal services can communicate directly with one another, optionally over RPC for lower overhead.
  • Data Ingestion Service: Responsible for receiving the high-volume stream of location data from the buses. It will validate the data and pass it to a message queue for asynchronous processing, a pattern that enhances reliability.
  • Metadata Service: A crucial component inspired by the Dropbox design. This centralized service will manage all the metadata in the system. This includes user profiles, bus routes, bus stop information, and pointers to the raw IoT data stored elsewhere.
  • Real-time Processing & ETA Service: This service consumes data from the message queue, processes the real-time bus locations, and calculates ETAs. It will need to query the Metadata Service for route and stop information.
  • “Digital Hail” Service: This service manages the state of hail requests. It could use a fast key-value store to map a hail request to a specific bus and stop, similar to how a URL shortener maps a short URL to a long one.
  • Notification Service: A general-purpose service responsible for sending real-time alerts. This service will notify drivers of hail requests, alert passengers when a bus is approaching, and send system alerts to administrators.
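The buffer-and-chunk behavior described for the IoT clients can be sketched as follows. This is a minimal illustration, not the actual device firmware: `send_chunk` is a hypothetical transport callable standing in for the upload to the Data Ingestion Service, and the chunk size of 50 readings is an arbitrary tuning assumption.

```python
import json
from collections import deque

CHUNK_SIZE = 50  # readings per upload batch (tuning assumption)

class BusTelemetryClient:
    """Buffers GPS readings locally and uploads them in fixed-size
    chunks once connectivity is restored (the 'smart client' pattern)."""

    def __init__(self, send_chunk):
        # send_chunk is a hypothetical callable; it should return
        # True when the ingestion service accepts the upload.
        self.send_chunk = send_chunk
        self.buffer = deque()

    def record(self, lat, lon, timestamp):
        # Always buffer locally first, even when the network is up.
        self.buffer.append({"lat": lat, "lon": lon, "ts": timestamp})

    def flush(self):
        """Drain the buffer in chunks; stop on the first failed upload
        so unsent readings stay buffered for the next attempt."""
        sent = 0
        while self.buffer:
            size = min(CHUNK_SIZE, len(self.buffer))
            chunk = [self.buffer[i] for i in range(size)]
            if not self.send_chunk(json.dumps(chunk)):
                break  # network still down; keep the rest buffered
            for _ in range(size):
                self.buffer.popleft()
            sent += size
        return sent
```

Because readings are only removed after a successful upload, a bus that loses connectivity mid-route simply accumulates readings and ships them in batches when it reconnects.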

Database Design

For the database, a hybrid approach using both SQL and NoSQL databases would be effective:

  • Relational Database (e.g., PostgreSQL): This should be used by the Metadata Service. As the Dropbox design points out, metadata requires strong consistency and ACID properties, which relational databases provide. This database will store structured information like user accounts, bus routes, and schedules.
  • Time-Series Database (e.g., Amazon Timestream): This is the best fit for storing the raw, high-frequency GPS location data from the buses. Its structure is optimized for time-based queries, which will be essential for analytics and for the Real-time Processing & ETA Service.
  • Caching Layer (e.g., Redis or Memcached): The Dropbox design highlights the importance of caching to improve performance and reduce database load. A caching layer should be placed in front of the database, especially for frequently accessed data like ETAs and bus locations. This is critical for handling the high read-to-write ratio expected from passengers using the mobile app.
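The caching layer described above would typically follow the cache-aside pattern: check the cache first, fall back to the ETA computation on a miss, then populate the cache with a short TTL. A minimal sketch, using an in-memory dict with a TTL as a stand-in for Redis, and a hypothetical `compute_eta` callable in place of a query to the Real-time Processing & ETA Service:

```python
import time

class EtaCache:
    """Cache-aside lookups for ETAs: serve hits from the cache,
    compute misses and store them with a TTL."""

    def __init__(self, compute_eta, ttl_seconds=15):
        # compute_eta is a hypothetical callable standing in for
        # the Real-time Processing & ETA Service.
        self.compute_eta = compute_eta
        self.ttl = ttl_seconds
        self.store = {}  # stop_id -> (eta, expires_at); Redis in production

    def get_eta(self, stop_id):
        now = time.monotonic()
        hit = self.store.get(stop_id)
        if hit is not None and hit[1] > now:
            return hit[0]  # cache hit: no ETA-service call
        eta = self.compute_eta(stop_id)  # cache miss: compute and store
        self.store[stop_id] = (eta, now + self.ttl)
        return eta
```

A short TTL (seconds, not minutes) keeps ETAs reasonably fresh while still absorbing the bulk of the read load from passengers polling the mobile app.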

Architecture Diagram

  • We chose a microservices architecture for scalability and fault tolerance, allowing us to scale individual components like the ETA service independently.
  • We are using a dedicated Metadata Service backed by a relational database to ensure data consistency, a pattern recommended in the Dropbox system design.
  • To ensure low latency for our users checking bus arrival times, we’ve included a caching layer, which is a best practice for systems with high read-to-write ratios.

Assumptions (from Week 1)

  • Number of Buses: 150 buses operating across all routes.
  • Bus Stops: 500 bus stops equipped with hailing buttons.
  • Operating Hours: 16 hours per day (6 AM to 10 PM).
  • Peak Hours: 4 hours per day (7-9 AM and 5-7 PM).
  • Active Users (Passengers): 20,000 daily active users (DAU).
  • Peak Concurrent Users: 5,000 users during peak hours.
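These assumptions support a back-of-envelope load estimate. The GPS reporting interval (5 seconds) and per-user polling rate (two ETA checks per minute at peak) are additional assumptions introduced here for illustration, not figures from Week 1:

```python
# Back-of-envelope load estimate from the Week 1 assumptions.
BUSES = 150
PEAK_CONCURRENT_USERS = 5_000

GPS_INTERVAL_S = 5           # assumed: each bus reports every 5 seconds
ETA_CHECKS_PER_USER_MIN = 2  # assumed: a peak user polls ETAs twice a minute

location_writes_per_s = BUSES / GPS_INTERVAL_S
eta_reads_per_s = PEAK_CONCURRENT_USERS * ETA_CHECKS_PER_USER_MIN / 60

print(f"location writes/s: {location_writes_per_s:.0f}")  # 30
print(f"peak ETA reads/s:  {eta_reads_per_s:.0f}")        # ~167
print(f"read:write ratio:  {eta_reads_per_s / location_writes_per_s:.1f}")
```

Under these assumptions the system sees only ~30 location writes per second but well over a hundred ETA reads per second at peak, which is the read-heavy profile that motivates the caching layer above.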