Since starting work, I have encountered various technical terms every day. As I do not have a formal background in computer science, many of these terms left me confused. I decided to keep a blog to collect and organize this information, and over time I became familiar with them.
Token, Session, Cookie Related#
First, let's explain their meanings:
- Introduction of Token: Without a Token, the client has to send the username and password with every request, and the server has to query the database each time to compare them, decide whether they are correct, and respond accordingly. The Token was introduced to relieve this pressure.
- Definition of Token: A Token is a string generated by the server that serves as a credential for the client's requests. After the first login, the server generates a Token and returns it to the client. From then on, the client only needs to include this Token in requests, without needing to provide the username and password again.
- Purpose of using Token: The purpose of the Token is to reduce the server's load and minimize frequent database queries, making the server more robust.
Understanding the significance of Token helps us clearly see why we need to use it.
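The definition above can be sketched in a few lines of Java. This is only a minimal illustration, not a production design: the class name `TokenServiceSketch`, its methods, and the in-memory maps (standing in for the database and for wherever tokens are kept) are all assumptions made for the example.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory token service; every name here is illustrative.
class TokenServiceSketch {
    private final SecureRandom random = new SecureRandom();
    // token -> username; a real system would likely keep this in a shared cache.
    private final Map<String, String> tokens = new ConcurrentHashMap<>();
    // Stand-in for the user table in a database (plain text only for brevity;
    // real systems store password hashes).
    private final Map<String, String> passwords = new ConcurrentHashMap<>();

    void addUser(String username, String password) {
        passwords.put(username, password);
    }

    // Checks the credentials once and, on success, returns a fresh random token.
    Optional<String> login(String username, String password) {
        String expected = passwords.get(username);
        if (expected == null || !expected.equals(password)) {
            return Optional.empty();
        }
        byte[] bytes = new byte[32];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        tokens.put(token, username);
        return Optional.of(token);
    }

    // Later requests carry only the token; no password comparison is needed.
    Optional<String> authenticate(String token) {
        return Optional.ofNullable(token).map(tokens::get);
    }
}
```

A caller would invoke `login` once, keep the returned string, and pass it to `authenticate` on every later request.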
Second, how to use Token?
This is the focus of this article, where I will introduce two commonly used methods.
- Using device number / device MAC address as Token (recommended)
Client: The client obtains the device number/MAC address during login and passes it as a parameter to the server.
Server: Upon receiving this parameter, the server stores it as the Token in the database and also sets it in the session. The server must intercept every client request and compare the Token provided by the client with the Token in the server's session. If they match, the request is allowed; if not, it is rejected.
Analysis: At this point, the client and server share a unique identifier Token, ensuring that each device has a unique session. The downside of this method is that the client needs to pass the device number / MAC address as a parameter, and the server needs to store it. The advantage is that the client does not need to log in again; once logged in, it can continue to use the Token. The server handles the timeout issue. If the server's Token times out, it simply queries the database with the Token provided by the client and assigns it to the variable Token, thus restarting the Token's timeout countdown.
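The timeout handling described in this analysis might look roughly like the sketch below. It assumes a set standing in for the database, a map of deadlines standing in for the session, and timestamps passed in explicitly so the behaviour is easy to follow; `DeviceTokenSketch` and its methods are hypothetical names. Note that a device number or MAC address is not a secret, so the sketch only mirrors the flow described here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "session copy times out, database copy restores it".
class DeviceTokenSketch {
    private final long timeoutMillis;
    // Stand-in for the database table that stores device tokens permanently.
    private final Set<String> database = ConcurrentHashMap.newKeySet();
    // Stand-in for the session: token -> time at which the session copy expires.
    private final Map<String, Long> sessionDeadline = new ConcurrentHashMap<>();
    // Counts database lookups for demonstration only (not thread-safe).
    private int databaseLookups = 0;

    DeviceTokenSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Login: store the token in the database and in the session.
    void login(String deviceToken, long nowMillis) {
        database.add(deviceToken);
        sessionDeadline.put(deviceToken, nowMillis + timeoutMillis);
    }

    // Per-request check: use the session copy while it is fresh; otherwise
    // fall back to the database and restart the countdown.
    boolean allow(String deviceToken, long nowMillis) {
        Long deadline = sessionDeadline.get(deviceToken);
        if (deadline != null && nowMillis < deadline) {
            return true;
        }
        databaseLookups++;
        if (!database.contains(deviceToken)) {
            sessionDeadline.remove(deviceToken);
            return false;
        }
        sessionDeadline.put(deviceToken, nowMillis + timeoutMillis);
        return true;
    }

    int databaseLookups() {
        return databaseLookups;
    }
}
```

With this layout the database is consulted only when the session copy has expired, which is the load reduction the Token is meant to provide.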
- Using session value as Token
Client: The client only needs to log in with the username and password.
Server: After receiving the username and password, the server checks them. If correct, it returns the locally obtained sessionID as the Token to the client, which can then use it to request data.
Analysis: The benefit of this method is convenience, as it does not require data storage. However, the downside is that when the session expires, the client must log in again to access data.
Third, what problems arise during use and their solutions?
We have just introduced two methods of using Token, but various issues can arise during use. The first method overlooks one problem: under poor network conditions or concurrent requests, the same data can be submitted multiple times.
Solution to this problem: bind the Token to the session. How? Please see this explanation:
A session is an object, with a unique identifier, that the server maintains for the whole of a single user's interaction. Across multiple requests from the same user, the session is always the same object rather than several objects, so it can be locked. When several requests from the same user arrive at once, locking the session lets them through one at a time. This article stores the Token in the session and uses it to check whether the same user has made concurrent duplicate requests. When a request arrives, the Token in the session is compared with the Token in the request. If they do not match, the request is treated as a duplicate submission and is not allowed.
This is the solution to prevent duplicate submissions.
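One way to read the explanation above is the sketch below. It assumes that the server stores a one-time Token in the session when the form is prepared and removes it as soon as the first matching request arrives, so that a second copy of the same request no longer matches. The `Session` class is a simplified stand-in for a real server-side session, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical duplicate-submission guard built on a per-user session object.
class SubmitGuardSketch {
    // Simplified stand-in for a server-side session.
    static final class Session {
        private final Map<String, Object> attributes = new HashMap<>();
    }

    // Called before the client submits: store a one-time token in the session.
    static String issueToken(Session session) {
        String token = UUID.randomUUID().toString();
        synchronized (session) {
            session.attributes.put("submitToken", token);
        }
        return token;
    }

    // Called on submit: locking the session serializes concurrent requests,
    // and only the first request carrying the current token passes.
    static boolean tryConsume(Session session, String requestToken) {
        synchronized (session) {
            Object expected = session.attributes.get("submitToken");
            if (expected == null || !expected.equals(requestToken)) {
                return false; // treated as a duplicate (or invalid) submission
            }
            session.attributes.remove("submitToken");
            return true;
        }
    }
}
```

Because the comparison and the removal happen inside the same lock, two concurrent copies of a request cannot both succeed.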
Fourth, session, token, and cookie
What is session authentication?
The HTTP protocol is stateless, so a user who authenticates with a username and password would otherwise have to send them again with the next request. To let the backend application recognize which user made a request, the server stores a copy of the user's login information and returns an identifier for it to the browser (frontend) in the response. The frontend saves this as a cookie and sends it to the backend application with the next request, allowing the backend to identify which user made the request. This is traditional session authentication.
What is the process of session authentication?
When the browser first accesses the server, the server creates a session and generates a unique session key, known as sessionID. The sessionID and corresponding session are saved as key-value pairs in cache or can be persisted in a database. The server then sends the sessionID to the client in the form of a cookie. When the browser accesses again, it sends the sessionID from the cookie. The server matches the request based on the sessionID.
If the browser disables cookies or does not support them, the sessionID can be sent to the server through URL rewriting instead.
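The session flow above can be sketched without any web framework. The cookie name `SESSIONID`, the class `SessionRegistrySketch`, and the in-memory map are assumptions for illustration; a real container manages these details itself.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry mapping sessionIDs to session data.
class SessionRegistrySketch {
    static final String COOKIE_NAME = "SESSIONID"; // assumed cookie name

    // sessionID -> session attributes, kept as key-value pairs in memory.
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // First visit: create a session and return the Set-Cookie header value.
    String createSession() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return COOKIE_NAME + "=" + id + "; Path=/; HttpOnly";
    }

    // Later visits: read the sessionID from the Cookie request header
    // and look up the matching session, if any.
    Optional<Map<String, Object>> resolve(String cookieHeader) {
        if (cookieHeader == null) {
            return Optional.empty();
        }
        for (String pair : cookieHeader.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(COOKIE_NAME)) {
                return Optional.ofNullable(sessions.get(kv[1]));
            }
        }
        return Optional.empty();
    }
}
```

The browser only ever holds the identifier; everything else about the user stays on the server, keyed by that identifier.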
What is token authentication?
A token is a string generated by the server, serving as an identifier for the client to make requests.
After the user logs in for the first time, the server generates a token and returns it to the client. From then on, the client only needs to include this token in requests, without needing to provide the username and password again.
What is the difference between token authentication and session authentication?
Both token and session are used for identity verification.
The server saves a copy of the session, which may be stored in cache, files, or databases. Both session and token have expiration times and need to be managed.
In fact, the issue with token and session is a trade-off between time and space: session exchanges space for time, while token exchanges time for space. The choice between the two depends on the specific situation.
While both involve "the client records and carries it with each access," tokens can easily be designed to be self-contained, meaning the backend does not need to keep track of anything. Each request is stateless: the server verifies the token by decrypting it or checking its signature and reaches a legal/illegal conclusion on the spot. The whole judgment relies on logic fixed on both the client and server sides, with all the needed information carried in the token itself. This is true statelessness.
On the other hand, sessionID is typically a random string that requires backend verification for its validity. What if the server restarts and the session in memory is lost? What if the Redis server crashes?
Plan A: I give you an ID card, but it's just a piece of paper with the ID number written on it. Each time you come to do something, I check your ID in the backend to see if it's valid.
Plan B: I give you an encrypted ID card. From then on, as long as you show this card, I know you are definitely one of us.
That's the difference.
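Plan B can be illustrated with a signed token. The sketch below uses an HMAC-SHA256 signature rather than encryption, which is one common way to make a token self-contained; the token layout (`payload.signature`), the `|` separator, and the class name are assumptions for this example and do not follow any particular standard.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical self-contained token: the server keeps only a secret key.
class SignedTokenSketch {
    private final byte[] secret;

    SignedTokenSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    // Assumed layout: base64url(username|expiry) + "." + base64url(HMAC).
    String issue(String username, long expiresAtMillis) {
        byte[] payload = (username + "|" + expiresAtMillis).getBytes(StandardCharsets.UTF_8);
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(payload) + "." + enc.encodeToString(hmac(payload));
    }

    // Returns the username if the signature is valid and the token has not expired.
    Optional<String> verify(String token, long nowMillis) {
        String[] parts = token.split("\\.");
        if (parts.length != 2) {
            return Optional.empty();
        }
        byte[] payload;
        byte[] signature;
        try {
            payload = Base64.getUrlDecoder().decode(parts[0]);
            signature = Base64.getUrlDecoder().decode(parts[1]);
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
        // Constant-time comparison of the presented and recomputed signatures.
        if (!MessageDigest.isEqual(signature, hmac(payload))) {
            return Optional.empty();
        }
        String text = new String(payload, StandardCharsets.UTF_8);
        int sep = text.lastIndexOf('|');
        if (sep < 0 || nowMillis >= Long.parseLong(text.substring(sep + 1))) {
            return Optional.empty();
        }
        return Optional.of(text.substring(0, sep));
    }

    private byte[] hmac(byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Nothing is looked up during `verify`: a server restart or a lost cache does not invalidate tokens, which is the contrast with Plan A. The cost is that a token cannot easily be revoked before it expires.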
Both authentication methods aim to solve the problem of maintaining state in the HTTP protocol, as HTTP is a stateless protocol.
REST Services#
REST should meet the following conditions:
Use a client/server model (C/S structure, a network architecture that separates the client from the server. Each instance of client software can send requests to a server or application server.)
For example, front-end and back-end separation, where the page and service do not run on the same server.
Hierarchical system
For example, a parent system has multiple child modules, each module being an independent service.
Stateless
The server does not save any state about the client, meaning that to access backend services, the token must be passed.
Cacheable
For example, the server can cache information about logged-in users through tokens. The client requests will carry a token, and the backend service retrieves user information from the cache using the provided token, improving efficiency.
Uniform interface
For example, every resource is identified by a URL and operated on through the same standard HTTP methods (GET, POST, PUT, DELETE), whichever module provides it.
If a system meets the five constraints listed above, it is considered RESTful (a software architectural style, not a standard, providing a set of design principles and constraints. It is mainly used for software that interacts between clients and servers. Software designed based on this style can be simpler, more hierarchical, and easier to implement caching mechanisms.)
Eureka#
- Introduction to Eureka
Eureka is a service discovery framework developed by Netflix. It is a REST-based service primarily used to locate middle-tier services running in the AWS cloud, for the purposes of load balancing and failover. Spring Cloud integrates it into its subproject spring-cloud-netflix to provide service discovery in Spring Cloud.
1.1 Eureka Components
Eureka consists of two components: Eureka Server and Eureka Client.
1.1.1 Eureka Server
Eureka Server provides service registration. After each node starts, it registers with the Eureka Server, so the service registry in the Eureka Server stores information about all available service nodes, which can be viewed in its web interface.
Eureka Server itself is also a service and will automatically register with the Eureka registry center by default.
If you set up a standalone Eureka Server as the registry center, you need to configure it to disable the Eureka Server's automatic self-registration. After all, registering a service with the registry center that it itself provides makes little sense.
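For a standalone server in a Spring Cloud Netflix project, disabling self-registration is usually done with two client properties. The fragment below is a sketch of an `application.properties` file; the port shown is just the conventional default and can be anything.

```properties
# Standalone Eureka Server: do not register with itself, do not fetch a registry.
server.port=8761
eureka.client.register-with-eureka=false
eureka.client.fetch-registry=false
```

The first property stops the server from registering itself as a client, and the second stops it from trying to pull the registry from a peer that does not exist.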
Eureka Server provides service registration, discovery, and heartbeat detection through interfaces like Register, Get, and Renew.
1.1.2 Eureka Client
Eureka Client is a Java client that simplifies interaction with the Eureka Server. The client also has a built-in load balancer that uses a round-robin algorithm. After the application starts, it sends heartbeats to the Eureka Server, every 30 seconds by default. If the Eureka Server does not receive a heartbeat from a node for several heartbeat cycles (90 seconds by default), it removes that service node from the service registry.
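The heartbeat and removal rule can be modelled with a toy registry. This is not Eureka's actual code or API; `LeaseRegistrySketch` and its methods are invented for illustration, and time is passed in explicitly so the 30-second and 90-second figures from the text are easy to trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Toy registry illustrating register / renew / evict; NOT Eureka's classes.
class LeaseRegistrySketch {
    static final long LEASE_DURATION_MS = 90_000; // removal threshold from the text

    // instanceId -> time of the last heartbeat received.
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    void register(String instanceId, long nowMillis) {
        lastHeartbeat.put(instanceId, nowMillis);
    }

    // Heartbeat. Returns false if the instance is unknown (for example,
    // already removed) and therefore has to register again.
    boolean renew(String instanceId, long nowMillis) {
        return lastHeartbeat.computeIfPresent(instanceId, (id, old) -> nowMillis) != null;
    }

    // Removes every instance whose last heartbeat is older than the lease.
    List<String> evictExpired(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        lastHeartbeat.forEach((id, last) -> {
            if (nowMillis - last > LEASE_DURATION_MS && lastHeartbeat.remove(id, last)) {
                evicted.add(id);
            }
        });
        return evicted;
    }

    // Currently registered instances, sorted for stable output.
    List<String> instances() {
        return new ArrayList<>(new TreeSet<>(lastHeartbeat.keySet()));
    }
}
```

An instance that keeps calling `renew` every 30 seconds stays registered, while one that goes silent disappears the next time `evictExpired` runs after its lease has lapsed.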
Eureka Client has two roles: Application Service (Service Provider) and Application Client (Service Consumer).
1.1.2.1 Application Service
The service provider that is registered with the Eureka Server.
1.1.2.2 Application Client
The service consumer that discovers and consumes services through the Eureka Server.
Here, Application Service and Application Client are not absolute definitions, as a Provider can consume services from other Providers while providing its own services; a Consumer can also provide external services while consuming services.
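When an Application Client has discovered several instances of a provider, the round-robin load balancing mentioned earlier simply hands out instances in turn. A minimal sketch, with a hypothetical class name and the instance list passed in as plain strings:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin chooser over a list of discovered instances.
class RoundRobinSketch {
    private final AtomicInteger next = new AtomicInteger();

    // Returns the instances one after another, wrapping around at the end.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        // floorMod keeps the index non-negative even after integer overflow.
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Each call advances a shared counter, so consecutive requests are spread evenly across the available providers.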
SpringMVC Spring Boot Spring Cloud#
Spring Boot is merely a configuration tool, integration tool, and auxiliary tool.
Spring MVC is a framework, the actual running code in the project.
The Spring framework is like a family, with many derivative products such as Boot, Security, JPA, etc. However, they all share the foundation of Spring's IoC and AOP, where IoC provides a dependency injection container, and AOP addresses aspect-oriented programming. Other advanced features of extended products are built on these two foundations.
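The two foundations can be shown with plain JDK classes, without Spring itself. In the sketch below, constructor injection stands in for what an IoC container automates, and a JDK dynamic proxy stands in for an AOP aspect; all class and method names are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

// Hand-written illustration of dependency injection and a proxy-based aspect.
class IocAopSketch {
    interface GreetingService {
        String greet(String name);
    }

    static class SimpleGreetingService implements GreetingService {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Depends only on the interface; the dependency is handed in from outside.
    static class GreetingController {
        private final GreetingService service;

        GreetingController(GreetingService service) {
            this.service = service;
        }

        String handle(String name) {
            return service.greet(name);
        }
    }

    // AOP-style wrapper: records each call, then delegates to the real target.
    static GreetingService withCallLog(GreetingService target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("before " + method.getName());
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            }
        };
        return (GreetingService) Proxy.newProxyInstance(
                GreetingService.class.getClassLoader(),
                new Class<?>[] {GreetingService.class},
                handler);
    }
}
```

The controller never learns whether it received the plain service or the proxied one, which is the decoupling that the other Spring products build on.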
Spring MVC provides a loosely coupled way to develop web applications. It is a module of Spring and a web framework. Through Dispatcher Servlet, ModelAndView, and View Resolver, developing web applications becomes easy. The problem domain it addresses is website application or service development—URL routing, session management, template engines, static web resources, etc.
Spring Boot implements automatic configuration, reducing the complexity of project setup. It primarily addresses the cumbersome configuration required when using the Spring framework, so it is not a replacement for Spring but closely integrates with it to enhance the developer experience. It also integrates a large number of commonly used third-party library configurations (such as Jackson, JDBC, Mongo, Redis, Mail, etc.), allowing these libraries to be used with almost zero configuration in Spring Boot applications.
Spring Boot is merely a facilitator, helping to simplify the project setup process. If it is a web project, using Spring MVC as the MVC framework, the workflow is exactly the same as described above, as this part of the work is done by Spring MVC, not Spring Boot.
For users, switching to Spring Boot changes the project initialization method and configuration files, and there is no need to install a container server like Tomcat separately. The project can be packaged with Maven into a jar and that jar run directly as a website, but the core business logic implementation and business process remain unchanged.
In summary: Spring initially utilized the "factory pattern" (DI) and "proxy pattern" (AOP) to decouple application components. People found it quite useful, so they created an MVC framework (some components decoupled using Spring) for web application development (Spring MVC). Then they realized that they were writing a lot of boilerplate code for each development, leading to the creation of "lazy integration packages" (starters), which is what Spring Boot is.
So, in the simplest terms:
Spring is an "engine";
Spring MVC is an MVC framework based on Spring;
Spring Boot is a rapid development integration package based on conditional registration in Spring 4.
Spring Cloud is a complete microservices framework, an orderly collection of various frameworks. It combines mature and practically tested service frameworks developed by various companies, re-packaging them in a Spring Boot style to shield complex configurations and implementation principles, ultimately providing developers with a simple, understandable, easy-to-deploy, and easy-to-maintain distributed system development toolkit. It cleverly simplifies the development of distributed system infrastructure, such as service discovery registration, configuration center, message bus, load balancing, circuit breakers, data monitoring, etc., all of which can be started and deployed with one click using Spring Boot's development style.
CAS#
CAS stands for Compare and Swap. Literally, it means to compare and then update. In simple terms: read the value V at a certain memory location and compare it with an expected value A. If V matches A, the new value B is written to that location; if they do not match, nothing is written. Callers typically repeat the whole read, compare, and write sequence in a loop until it succeeds. CAS is the basis of optimistic locking, while synchronized is a pessimistic lock.
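In Java, CAS is available through the atomic classes in `java.util.concurrent.atomic`. The sketch below writes the retry loop out by hand on top of `AtomicInteger.compareAndSet`; the class `CasCounterSketch` and the helper `countWithThreads` are made up for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Counter that increments with an explicit compare-and-swap retry loop.
class CasCounterSketch {
    private final AtomicInteger value = new AtomicInteger();

    int increment() {
        while (true) {
            int expected = value.get();  // A: the value we believe is in memory
            int updated = expected + 1;  // B: the value we want to write
            // Writes B only if memory still holds A; otherwise loop and retry.
            if (value.compareAndSet(expected, updated)) {
                return updated;
            }
        }
    }

    int get() {
        return value.get();
    }

    // Runs several threads against one counter and returns the final value.
    static int countWithThreads(int threads, int incrementsPerThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

No thread ever blocks waiting for a lock here; a thread that loses the race just reads the new value and tries again, which is what makes the approach optimistic.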
IO Multiplexing#
As a side note, the I/O multiplexing mechanism deserves a careful discussion, because the term is usually explained too briefly and is not widely understood. Let me illustrate with an analogy: Xiao Qu runs a courier service in City S, responsible for same-city express delivery. Due to funding constraints, Xiao Qu hires several couriers but realizes that he can only afford one vehicle for deliveries.
Operating Method One
For each package delivered, Xiao Qu assigns a courier to monitor it, and the courier drives the vehicle to deliver the package. Over time, Xiao Qu discovers that this method has the following issues: dozens of couriers spend most of their time fighting for the vehicle, with most couriers idling. The one who gets the vehicle can deliver the package. As the number of packages increases, so do the couriers, and Xiao Qu finds the courier shop getting crowded, making it impossible to hire new couriers. Coordination among couriers takes a lot of time. Given these drawbacks, Xiao Qu decides to adopt the following operating method.
Operating Method Two
Xiao Qu hires only one courier. When customers drop off packages, Xiao Qu labels them according to their delivery locations and places them in a designated area. The courier then picks up the packages one by one, drives the vehicle to deliver each, and returns to pick up the next package.
Comparison
Comparing the two operating methods, it is clear that the second method is more efficient and better. In this analogy:
Each courier -> Each thread
Each package -> Each socket (I/O stream)
The delivery location of the package -> Different states of the socket
Customer package delivery requests -> Requests from clients
Xiao Qu's operating method -> Code running on the server
One vehicle -> Number of CPU cores
Thus, we can conclude:
- Operating Method One represents the traditional concurrent model, where each I/O stream (package) is managed by a new thread (courier).
- Operating Method Two represents I/O multiplexing. Only a single thread (one courier) manages multiple I/O streams by tracking the state of each I/O stream (the delivery locations of each package).
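Operating Method Two corresponds to Java NIO's `Selector`. The sketch below uses in-process pipes instead of network sockets so that it runs on its own; the class `MultiplexSketch` and the method `collect` are invented for the example, and it assumes short ASCII messages that fit into the pipe buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// One thread, one Selector, many I/O streams (pipes stand in for sockets).
class MultiplexSketch {
    // Writes one message into each pipe, then collects them all on the
    // calling thread; returns pipe index -> text received from that pipe.
    static Map<Integer, String> collect(String... messages) {
        Map<Integer, String> received = new TreeMap<>();
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < messages.length; i++) {
                Pipe pipe = Pipe.open();
                pipe.source().configureBlocking(false);
                // The attachment remembers which stream the key belongs to.
                pipe.source().register(selector, SelectionKey.OP_READ, i);
                pipe.sink().write(ByteBuffer.wrap(messages[i].getBytes(StandardCharsets.UTF_8)));
                pipe.sink().close(); // the reader will see end-of-stream
            }
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            int open = messages.length;
            while (open > 0) {
                selector.select(); // blocks until at least one stream is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    Pipe.SourceChannel source = (Pipe.SourceChannel) key.channel();
                    int index = (Integer) key.attachment();
                    buffer.clear();
                    int n = source.read(buffer);
                    if (n < 0) { // end of stream: stop watching this channel
                        source.close();
                        open--;
                    } else if (n > 0) {
                        buffer.flip();
                        String text = StandardCharsets.UTF_8.decode(buffer).toString();
                        received.merge(index, text, String::concat);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }
}
```

The single thread never waits on one particular stream. It asks the selector which streams are ready and serves only those, just as the one courier picks up whichever labelled package is waiting.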