System Design Interview Guide

May 23, 2020 • Estimated reading time: 6 minutes

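The key to system design is breadth of knowledge plus thorough consideration. That takes experience, but studying cases and common patterns can make up for a lack of it.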

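1. Answering techniques & strategy
2. Non-Functional Requirements
3. Key numbers
4. Technical topics
5. Question set & classic cases
6. Learning resources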

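All the courses and materials recommended here can be found on Bilibili, YouTube, Telegram, BitTorrent, and cloud drives; search for them yourself 🈶️ 💯 🆓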

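1. Answering techniques & strategy

Compared with algorithm interviews, system design questions and answers are more open-ended, with no fixed path. The process is more like continuous discovery and discussion than simply finding the answer.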

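Design Gurus' step-by-step approach (seven steps):

1. Clarify requirements
2. Define the system interface
3. Back-of-the-envelope estimation
4. Define the data model
5. High-level design
6. Detailed design
7. Bottlenecks, open questions & summary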

The 4S analysis method summarized by Jiuzhang (九章):

-   Scenario: which features need to be designed, and to what extent?
    -   who will use
    -   how many will use
    -   usage pattern
    -   use case not covered
    -   estimated throughput
    -   estimated latency
-   Service: split the large system into small services
    -   api for read/write scenarios
    -   schema
-   Storage: how data is stored and accessed
    -   data size
    -   read/write ratio
    -   read/write traffic
    -   rdbms/nosql
-   Scale: fix shortcomings and handle problems that may arise
    -   scaling the algorithm
    -   scaling the individual component
    -   memory cache
    -   DNS, CDN, reverse proxy, load balancer, ...
    -   async: message queue, back pressure, time & order, ...
    -   communication: tcp, udp, rest, rpc, ...

2. Non-Functional Requirements

[Figure: Non-Functional Requirements]

3. Key numbers

There should be no need to memorize each individual figure; the numbers below are for estimation:

Latency

[Figure: latency]

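CPU access (including CPU cache): 10 ns
L1 cache: 0.5 ns
L2 cache: 5 ns
Main memory access: 100 ns (e.g. the in-memory data access behind Redis, excluding network time)
Network latency within the same data center: 1 ms
Network latency between regions: 10 ms
International network latency: 100 ms
HDD disk seek: 10 ms
HDD disk access: 10 ms; an SSD is roughly 100x faster, about 100 µs
HDD disk throughput: 100 MB/s; an SSD is several times higher
Read 1 MB sequentially from memory: 250 µs
Read 1 MB sequentially over the network: 10,000 µs = 10 ms
Read 1 MB sequentially from disk: 30,000 µs = 30 ms
Database insert: 1 ms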

Round-trips within a data center at 2,000 trips/sec.
World-wide round trips at 6-7 trips/sec.

Throughput

Web server QPS: 1,000
Relational database QPS per machine: 1,000
Disk-based NoSQL database QPS per machine: 10K
In-memory access QPS per machine: 1M

1 million reqs/month --> .4 reqs/sec
2.6 million reqs/month --> 1 reqs/sec
5 million reqs/month --> 2 reqs/sec
10 million reqs/month --> 4 reqs/sec
100 million reqs/month --> 40 reqs/sec
1 billion reqs/month --> 400 reqs/sec
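
The conversions above come from dividing by the number of seconds in a month. Here is a minimal sketch, assuming a 30-day month; the helper name `monthly_to_rps` is hypothetical and only serves this illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month = 2,592,000 s


def monthly_to_rps(requests_per_month: float) -> float:
    """Convert a monthly request count to average requests per second."""
    return requests_per_month / SECONDS_PER_MONTH


if __name__ == "__main__":
    # The same volumes as in the list above.
    for monthly in (1e6, 2.6e6, 5e6, 10e6, 100e6, 1e9):
        rps = monthly_to_rps(monthly)
        print(f"{monthly:>15,.0f} reqs/month ~ {rps:.1f} reqs/sec")
```

The listed figures are rounded for mental arithmetic, so the computed values land slightly below them (for example, about 386 rather than 400 reqs/sec for 1 billion requests per month).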

System throughput depends on three factors:

  1. QPS/TPS
  2. Concurrency
  3. Response Time (RT)

QPS/TPS = concurrency/average response time
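
The relation above can be tried out with a small helper. This is only a sketch: the function name `throughput_qps` and the sample numbers are assumptions, and the response time is taken in seconds.

```python
def throughput_qps(concurrency: int, avg_response_time_s: float) -> float:
    """QPS = concurrency / average response time (response time in seconds)."""
    if avg_response_time_s <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_response_time_s


if __name__ == "__main__":
    # Hypothetical service: 100 requests in flight, each taking 0.5 s on average.
    print(f"{throughput_qps(100, 0.5):.0f} QPS")
```

Read the other way round, the same formula says that at a fixed QPS, a longer response time means more requests in flight at once.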

Commonly estimated quantities include: QPS, peak QPS, storage, cache, and number of servers.

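For example, if a page contains 30 thumbnails of 256 KB each and they are read sequentially, the time spent on disk is:

30 seeks * 10 ms/seek + 30 * 256 KB / 30 MB/s ≈ 560 ms

If they are read in parallel (assuming the reads can be served concurrently, e.g. from separate disks):

10 ms/seek + 256 KB / 30 MB/s ≈ 18 ms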
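
The thumbnail estimate can be reproduced with a few lines of arithmetic. This sketch assumes decimal units (1 MB = 1000 KB) and, for the parallel case, that every image sits on an independent disk; the function names are hypothetical.

```python
SEEK_MS = 10.0         # HDD seek time used in the example
DISK_MB_PER_S = 30.0   # sequential read rate used in the example
KB_PER_MB = 1000.0     # assumption: decimal units


def read_ms(size_kb: float) -> float:
    """Transfer time in ms for size_kb once the disk head is positioned."""
    return size_kb / KB_PER_MB / DISK_MB_PER_S * 1000.0


def sequential_ms(n_images: int, size_kb: float) -> float:
    """One seek plus one transfer per image, one image after another."""
    return n_images * (SEEK_MS + read_ms(size_kb))


def parallel_ms(size_kb: float) -> float:
    """Assumption: all images are read at once from independent disks."""
    return SEEK_MS + read_ms(size_kb)


if __name__ == "__main__":
    print(f"sequential: {sequential_ms(30, 256):.0f} ms")
    print(f"parallel:   {parallel_ms(256):.1f} ms")
```

Under these assumptions the sequential read takes about 556 ms and the parallel read about 18.5 ms, which the text rounds to 560 ms and 18 ms.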

End-to-end latency = client processing latency + Network latency + server processing latency

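Let's tie the concepts above together with an example. By the 80/20 rule, if 80% of each day's visits are concentrated in 20% of the time, that 20% is called the peak period.

Formula: (total PV * 80%) / (seconds per day * 20%) = requests per second during the peak period (peak QPS)
Machines: peak QPS / QPS of a single machine = number of machines needed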
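
A minimal sketch of this peak estimate follows. The function names and the sample traffic (10 million page views per day, 100 QPS per machine) are assumptions for illustration only.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def peak_qps(daily_pv: float, traffic_share: float = 0.8, time_share: float = 0.2) -> float:
    """Peak QPS when traffic_share of the visits arrive in time_share of the day."""
    return (daily_pv * traffic_share) / (SECONDS_PER_DAY * time_share)


def machines_needed(daily_pv: float, qps_per_machine: float) -> int:
    """Number of machines needed at peak, rounded up to a whole machine."""
    return math.ceil(peak_qps(daily_pv) / qps_per_machine)


if __name__ == "__main__":
    pv = 10_000_000  # hypothetical daily page views
    print(f"peak QPS: {peak_qps(pv):.0f}")
    print(f"machines at 100 QPS each: {machines_needed(pv, 100)}")
```

With these sample numbers the peak is roughly 463 QPS, so five machines at 100 QPS each would be needed, before adding any headroom for redundancy.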

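Single-thread QPS formula: QPS = 1000 ms / RT (RT in milliseconds)

Optimal thread count: the critical number of threads that just exhausts the server's bottleneck resource, given by:

Optimal thread count = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs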
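
Both formulas are easy to try with concrete numbers. This is a sketch only: the function names and the sample timings (80 ms waiting, 20 ms on CPU, 4 CPUs) are assumptions, and real systems need measurement rather than a formula alone.

```python
def single_thread_qps(rt_ms: float) -> float:
    """QPS of one thread handling requests back to back: 1000 ms / RT."""
    return 1000.0 / rt_ms


def optimal_threads(wait_ms: float, cpu_ms: float, cpu_count: int) -> int:
    """((wait time + CPU time) / CPU time) * number of CPUs, rounded."""
    return round((wait_ms + cpu_ms) / cpu_ms * cpu_count)


if __name__ == "__main__":
    # Hypothetical request: 80 ms waiting on I/O, 20 ms on CPU, on a 4-CPU machine.
    print(single_thread_qps(100))       # one thread, RT = 100 ms
    print(optimal_threads(80, 20, 4))   # threads needed to keep 4 CPUs busy
```

The intuition is that while one thread waits on I/O, other threads can use the CPU, so the more time a request spends waiting, the more threads each CPU can keep busy.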

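Properties:

  • Once the optimal thread count is reached, adding more threads leaves QPS unchanged while response time grows; adding still more makes QPS start to drop.
  • Every system has an optimal thread count, but it changes with the system's state.
  • The bottleneck resource can be CPU, memory, locks, or I/O. Going beyond the optimal thread count causes contention for that resource, so response time increases.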

High availability

System Availability = Uptime ÷ (Uptime + Downtime)
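
To get a feel for the formula, the sketch below converts an availability target into allowed downtime per year. The function names are hypothetical, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def availability(uptime: float, downtime: float) -> float:
    """System Availability = Uptime / (Uptime + Downtime), same unit for both."""
    return uptime / (uptime + downtime)


def downtime_minutes_per_year(avail: float) -> float:
    """Downtime allowed per year for a given availability (0.999 means 99.9%)."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target:.3%} -> {minutes:,.1f} minutes of downtime per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten: about 525.6 minutes per year at 99.9% and about 52.6 minutes at 99.99%.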

[Figure: high availability]

SLA, SLI, SLO

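[Figure: SLA, SLI, SLO]

An SLA is usually a fairly business-oriented metric. The relationship among the three can be explained with an analogy: the SLA is your credit card limit, the SLO is your budget (the SLA determines the SLO), and the SLIs are your actual expenses, used to monitor whether you are staying within budget. Common SLIs include: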

  • uptime of service
  • number of transactions
  • latency
  • error rate
  • throughput
  • response time
  • durability

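4. Technical topics

CDN, Cache, Proxy, Load Balancing, API Gateway, Service Mesh, REST, GraphQL, RPC, Message Queue, Pub/Sub, Pull/Push, Vertical Scaling, Horizontal Scaling, Sharding, Partitioning, NoSQL, Indexing, Key-Value, multi-datacenter replication, Consistent Hashing, Distributed ID/UUID, ACID, CAP, Paxos, Raft. Expanded, the scope is very broad; the only way is to build it up over time and tackle the topics one by one:

Distributed System Design Knowledge Map for the Cloud-Native Era, 22 topics (云原生时代|分布式系统设计知识图谱)

[Figure: distributed system design knowledge map]

The typical architecture of Internet applications: architectures share common ground, so understanding one at least gives you the general direction. See 👉 淘宝服务端高并发分布式架构演进之路 (The Evolution of Taobao's High-Concurrency Distributed Server-Side Architecture)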

5. Question set & classic cases

If you are cramming, or already have architecture experience, I think carefully studying the cases is enough:

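| Problem / case | GitHub | YouTube | My answer ⏳ |
| --- | --- | --- | --- |
| Airbnb (sharing service) | | | |
| Amazon, eBay, Shopify (e-commerce) | Solution | codeKarle | |
| Dropbox (network file storage); Dropbox, Google Doc, S3 | | | |
| Facebook Newsfeed | | | |
| Facebook Social | | | |
| Instagram (photo sharing) | Solution | Gaurav Sen | |
| Parking Lot | | | |
| Search Engine | | | |
| TikTok (short-video sharing) | | | |
| Tinder (dating site) | | | |
| TinyUrl, Bit.ly (short links) | Solution | Tech Dummies Narendra L | |
| Twitter | Solution | 花花酱, Tech Dummies Narendra L | |
| Uber, Yelp (location-based services) | | Tech Dummies Narendra L | |
| Web Crawler | Solution | Tech Dummies Narendra L | |
| WhatsApp, Facebook Messenger, WeChat | | Gaurav Sen | |
| YouTube, Netflix, Bilibili | | 花花酱, Gaurav Sen, Tech Dummies Narendra L | |
| Zoom | | codeKarle | |
| API Rate Limiter | Solution | Gaurav Sen | |
| Limit Order Book | Solution | | |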

6. Learning resources

System design courses (English)

🇺🇸

Design Gurus: We are a team of senior engineers and managers from Facebook, Google, Microsoft, Lyft, and Amazon

  • Educative: Scalability & System Design for Developers by Design Gurus

    As you progress in your career as a developer, you’ll be increasingly expected to think about software architecture. Can you design systems and make trade-offs at scale? Developing that muscle is a great way to set yourself apart from the pack. In this learning path, you’ll cover everything you need to know to design scalable systems for enterprise-level software. Buckle in.

  • Educative: Grokking the System Design Interview by Design Gurus

    System design questions have become a standard part of the software engineering interview process. Performance in these interviews reflects upon your ability to work with complex systems and translates into the position and salary the interviewing company offers you. Unfortunately, most engineers struggle with SDI, partly because of their lack of experience in developing large-scale systems and partly because of the unstructured nature of SDIs. Even engineers who’ve some experience building such systems aren’t comfortable with these interviews, mainly due to the open-ended nature of design problems that don’t have a standard answer.

    This course is a complete guide to master SDIs. It is created by hiring managers who’ve been working at Google, Facebook, Microsoft, and Amazon. We’ve carefully chosen a set of questions that have been repeatedly asked at top companies.

  • Educative: Grokking the Advanced System Design Interview by Design Gurus

    System design questions have increasingly become an integral part of software engineering interviews. For senior engineers, the discussion around system design is considered even more important than solving a coding question. In a system design interview, you can show your real design skills and show how they will work with designing complex systems. It is a given that a good performance in system design interviews will get you a senior position and result in higher salaries.

    This course presents the architectural review of famous distributed systems. The main goal is to extract out important design details that are relevant to system design interviews. The course also presents a list of system design patterns that constitute the common design problems and their solutions that different distributed systems have developed over time.

System design courses (Chinese)

🇨🇳

GitHub resources

Resources on GitHub: besides interview prep, they are also a treasure trove for learning architecture 💎:
