Algorithms, Data Structure, System Design Problems

Friday, 25 June 2021

Quiz application database design

I would start with 4 simple tables:

Users

- user_id        auto integer
- regtime        datetime
- username       varchar
- useremail      varchar
- userpass       varchar

Questions

- question_id    auto integer
- question       varchar
- is_active      enum(0,1)

Question_choices

- choice_id        auto integer
- question_id      Questions.question_id
- is_right_choice  enum(0,1)
- choice           varchar

User_question_answers

- user_id        Users.user_id
- question_id    Questions.question_id
- choice_id      Question_choices.choice.id
- is_right       enum(0,1)
- answer_time    datetime

My thought on this table design is:

table Users is for storing registered user.
table Questions is for storing all your questions.
- It has is_active so that you can selectively display only active questions (using WHERE is_active = '1')
table question_choices is for storing all available options. It has is_right_choice which defines what choice is the right answer for particular question.
Table User_question_answers is for storing answer from your user.

It has is_right for faster lookup, to see whether that particular question and answer choice is right (based on is_right_choice previously defined).
It also has answer_time just to note when that particular user answer the question.

-------------------------------

Some Improvements:

Table Players:

ID int (primary key, identity)
Name nvarchar(100)

Table Questions:

ID int (primary key, identity)
TextOfTheQuestion nvarchar(100)
correctAnswer int (foreign key to Answers.ID)
a int (foreign key to Answers.ID)
b int (foreign key to Answers.ID)
c int (foreign key to Answers.ID)

Table Answers:

ID int (primary key, identity)
TextOfTheAnswer nvarchar(100)

Table PlayerChoices:

PlayerID int (foreign key to Players.ID)
QuestionID int (foreign key to Questions.ID)

AnswerID int (foreign key to Answers.ID)

Customer Order Database design

You need four tables, something like this:

Possible Simplified Database Model

Customers

Contains a list of customers. One row per Customer. Would contain all the customers information - their contact details, etc...

Orders

Contains a list of orders. One row per order. Each order is placed by a customer and has a Customer_ID - which can be used to link back to the customer record. Might also store the delivery address, if different from the customers address from their record - or store addresses in separate tables.

OrderItems

Contains a list of order items. One row for each item on an order - so each Order can generate multiple rows in this table. Each item ordered is a product from your inventory, so each row has a product_id, which links to the products table.

Products

Contains a list of products. One row per product. Similar to the customers table, but for products - contains all the product details.

Here's the SQL code that you could use to create this structure - it will create a database for itself called mydb:

CREATE SCHEMA IF NOT EXISTS `mydb` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci ;
USE `mydb` ;

-- -----------------------------------------------------
-- Table `mydb`.`Customer`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `mydb`.`Customer` (
  `ID` INT NOT NULL ,
  `Name` TEXT NOT NULL ,
  `PhoneNo` VARCHAR(45) NULL ,
  PRIMARY KEY (`ID`) )
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`Order`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `mydb`.`Order` (
  `ID` INT NOT NULL ,
  `customer_id` INT NULL ,
  PRIMARY KEY (`ID`) ,
  INDEX `fk_Order_1_idx` (`customer_id` ASC) ,
  CONSTRAINT `fk_Order_1`
    FOREIGN KEY (`customer_id` )
    REFERENCES `mydb`.`Customer` (`ID` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`Product`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `mydb`.`Product` (
  `ID` INT NOT NULL ,
  `Name` VARCHAR(45) NOT NULL ,
  `Description` TEXT NULL ,
  PRIMARY KEY (`ID`) )
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `mydb`.`OrderItem`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `mydb`.`OrderItem` (
  `ID` INT NOT NULL ,
  `Order_ID` INT NOT NULL ,
  `Product_ID` INT NOT NULL ,
  `Quantity` INT NOT NULL ,
  PRIMARY KEY (`ID`) ,
  INDEX `fk_OrderItem_1_idx` (`Order_ID` ASC) ,
  INDEX `fk_OrderItem_2_idx` (`Product_ID` ASC) ,
  CONSTRAINT `fk_OrderItem_1`
    FOREIGN KEY (`Order_ID` )
    REFERENCES `mydb`.`Order` (`ID` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
  CONSTRAINT `fk_OrderItem_2`
    FOREIGN KEY (`Product_ID` )
    REFERENCES `mydb`.`Product` (`ID` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION)
ENGINE = InnoDB;

USE `mydb` ;

Monday, 21 June 2021

Optimistic and Pessimistic locking Database

Optimistic Locking is a strategy where you read a record, take note of a version number (other methods to do this involve dates, timestamps or checksums/hashes) and check that the version hasn't changed before you write the record back. When you write the record back you filter the update on the version to make sure it's atomic. (i.e. hasn't been updated between when you check the version and write the record to the disk) and update the version in one hit.

If the record is dirty (i.e. different version to yours) you abort the transaction and the user can re-start it.

This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.

Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two tier client server application) or an externally available transaction ID that can be used independently of the connection.

In the latter case you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID. This is how distributed transactions using two-phase commit protocols (such as XA or COM+ Transactions) work.

Sunday, 20 June 2021

Snapchat System Design

Snap Chat

This messenger sends snaps. Plain text cannot be sent or send very rarely.
- snap = Text + Video OR
- snap = Text + Image.

	FB-Messenger	Whatsapp	Snapchat
Creating Groups	y	y	n
Encryption	y	y	n
Media	Images/video/documents/audio	same as fb-msg	Images/video
Location sharing	y	y	n
Storage/Backup	images,text gets stored in gallery	same as fb-msg	not allowed
Automatic Deletion on Images	n	n	yes in 24 hours
Use case	-	messaging & sharing important documents	For entertainment

1. Requirements

Functional
- a. Sending snaps: User-A can send snap(Video, images, text) to user-B.
- b. Holding & Deleting snaps:
  - Snapchat service should hold the message for 24 hours, if in case user is not online and snap cannot be delivered instantly. Eg: User is Flight, once user comes online snap gets delivered to him, then snapchat server can delete from its DB.
  - if user is online, service will deliver message and delete snap instantly from DB.
- c. Searching a user: Unique userId is provided to users and all users are searched on basis of userId not phone numbers.
Non-Functional
- a. User-A can add any user as his friend based on userid and send snap.
- b. [Thumbnail to be shown to reciever.
- c. Normal filters on video,image provided to sender
- d. System should be Eventual consistent & Highly available.
Extended
- a. Blocking the users
- b. Following a user

2. BOE

snapchat revleaved in 2018 that number of snaps sent per day = 4 Billion. Assumed in 2021 = 10 Billion

QPS(Queries per second)
- 10 Billion / 86400 = 1010 / 105 = 105 = 1 lac queries per second
Storage Estimates: Will try overestimate

App Message: Metadata:75 bytes, Video:1MB
  Dst_user_id(10 bytes)   Allowed 20 characters. 1 character:4 bytes
  self_user_id(10 bytes)  Apart from IP address, client will send its unique user id
  timestamp(4 bytes)      Time elapsed from epoch(1 Jan 1970) in seconds
  Text(50 bytes)          Text sent with vieo or image
  Video(1 MB)             Video message

|App message|TCP(20 bytes)|IP(20 bytes)|DL(18 bytes)|         
//Note TCP,IP,DL Headers are not stored, these are used by router not snapchat.

                    1 day         //Assuming snapchat stores offline user data for 24 hours only
Metadata  = 75B x 10 Billion = 750 Billion  = .7 TB   //1 High end computer having 1 TB Hard disk
Videos    = 1MB x 10 Billion = 10 pow 15    = 1 PB    //1000 High end computers having 1 TB Hard disk

3. API Design

All API's should be extensible, ie if in future want to add new parameters that can be added). Variadic Templates C++.

A. Sender call these APIs

Deduplication is avoided using messageId. if server finds messageId already in DB, it will consider message as duplicate and discard it.

sendMessage ( string dst_user_id,
              string src_user_id,
              uint32_t timestamp,           //sec elapsed from Epoch(1/1/1970)
              string text,                  //
              uint8_t video, 
              uint32_t messageId)           //Identifies every message Uniquely.

B. Reciever call these APIs

Reciever will get snaps from server in 2 ways:
- a. PUSH: Whenever reciever comes online, server pushes the pending messages.
- b. PULL: Whenever reciever want to get messages it refreshes and takes message from server.

struct Message_details {
  string text,
  string video_url_on_object_store,
  string src_user_id
}
GetMessages (string self_user_id,        //Pull API
             uint32_t get_messages_after_timestamp,
             Message_details[]
)

PushAllMessages (Message_details[])    //Push API

4. HLD

Requirement-1: Sending snap from UserA to UserB Steps(7-10)

1. User opens snapchat app. Name resolution done using DNS.
2. CDN will be hosting user feed, periodically updating the page from App Server. CDN renders the feed.
3. User creates a snap, upload a (photo, Video).
4. User-B and sends to App-Server using ISP in JSON/XML in http.

App-Server                                                                                  User-A
      <- HTTPS(UserA_id, dst=UserB_id, timestamp, text, Video, messageId) | TCP | IP | DL |-

5. HTTPS message is decrypted using SSL Terminators.
6. Load Balancer selects App server based on Least Response Time Scheduling algo and sends packet.
7. Application-Server will send User_B_id, timestamp, messageId to DBFinder service.
8. DBFinder:
- Purpose of DBFinder?
  - Find and searches DB which stores table of User_B, that messageId exists or not, if not updates DB //See DB Design
  - Respond to AppServer, if messageId exits its duplicate else not
- Avoiding Deduplication?
  - Every user's snaps are stored in DB until he reads them. Eg: When User_A sends snap to User_B. This snap is stored in User_B's table in DB.
  - AppServer will check User_B's SQL Database table that messageId exists in table or not?
    - Worldwide all snapchat clients But once message is delivered, AppServer will inform clients to reuse/reset the messageId
  - if messageId exists then its duplicate message, drop it.
- We will place a Cache(Eg: Redis or Memcached). Redis preferred between Application server and DB.
9. Video,text,images are stored in different Databases(See DB Design), user-B,timestamp are pushed on MOM, user is acknowledged using zookeeper.
10. Sender Service will be subscriber to MOM and recieves (userB, timestamp). It will read DB_userB and send video,image,text to userB.

Requirement-2: Holding & Deleting snap (Steps 11,12)

11. After sender service receives ACK from user that message is read, it will push MessageId, userId on MOM.
12. Deletor service gets notifcation, reads messageID,userID from MOM, checks same in DB and deletes promptly.
- cronjob would be running on Database to delete the contents if age>24 hours.

Requirement-3: Searching the user

Application server will delegate task to DB-Finder to search userB database, if database is not found this means user is not present.

5. DB Design

We can store information in SQL or noSQL database.

1. SQL DB Design

User's Table
- Has one-to-many relationship with photo table, since 1 user can have multiple photos.

Primary Key: userID

| UserID | messageID | TextURL | VideoURL | Timestamp | src_UserID | email |
| ------ | --------- | ------- | -------- | --------- | ---------- | ----- |
| User_B | kanskna   | <>      |  <>      | 123       | User_A     |a@a.com|

Photo Table
- All photos which are posted by particular user. Each photo will have incremental photo-ID
- Photos will be stored in any distributed File Storage server: HDFS or GFS.

Primary Key: photo ID

| Photo-IDs | User-ID | Caption | location | Time of upload |   URL  |
| --------- | ------- | ------- | -------- | -------------- | ------ |
|    1      | user-A  | <>      |   <>     |  epoch time    |   -    |
|    4      | user-B  | <>      |   <>     |  epoch time    |   -    |
|    6      | user-A  | <>      |   <>     |  epoch time    |   -    |

Video Table //Similar to Photo Table
Follower Table
- let userA(follower) follows userB(followee).

| Followee | map<Followers> |
| -------- | -------------- |
| UserB    | UserA          |

2. NoSQL DB Design

We can store same data in noSQL Databases(Eg: dynamoDB, Cassandra) since noSQL are highly scalable, hence highly available and low latency.
Photo Table (Key, value)
- Key would photoId and value would be object containing all metadata related to photo(Eg: location/url, timestamp, size etc)

| UserID    |       <Key=PhotoIDs, value=Object_Containing_all_metadata>     |
| --------- | -------------------------------------------------------------- |
| userA     | key=photo-1, Value=Object<Location_of_photo, timestamp, size > |
            | key=photo-2, ...                                               |
```![image](https://assets.leetcode.com/users/images/a1de0852-522e-4185-ae24-6dcc09704453_1624202166.826889.jpeg)

Thursday, 17 June 2021

System Design : HOTEL booking system

Question 1 :

The purpose of the channel manager is to provide an easy way for hoteliers to host their listings on multiple online travel agents to increase global reach. A software that automates the tasks from updating rates to managing inventory. You have one interface to manage all of your channels

Users

Hotel management Systems like Taj Hotels Group: Hotel owners and managers who independently run their operations will be direct consumers. eg : Hotel Anshuman in Mahipalpur Delhi.

Online Travel Agents : The channel manager will be connected to world wide travel agents like Booking.com, Expedia, Agoda.

USECASES

AS HOTEL MANAGEMENT SYSTEMS

Onboard onto the Channel Manager
Sync property information
Sync availability & pricing information
Consume Bookings

AS CHANNEL MANAGER

Connect to different OTA’s
Sync Property Data & Availability & Pricing
Retrieve Bookings