

Mac showing wrong date and time despite trying everything


I'm not sure about the root cause, but re-syncing with the Apple time servers often fixes the issue.

From Terminal.app you can re-sync with the Apple time servers. Copy and paste: sudo sntp -sS time.apple.com

Spring Boot 3: Use Jetty instead of Tomcat


Issue: Spring Boot 3 is in high demand, and many teams are eager to migrate their applications to it, which is a positive trend. However, it introduces some significant changes that can cause problems. I recently ran into one when trying to use Jetty instead of Tomcat.

Error: java.lang.ClassNotFoundException: jakarta.servlet.http.HttpSessionContext with Spring Boot 3 and Jetty server

Solution: Upon investigation, I found that Spring Boot 3 requires several additional libraries in the project's `pom.xml` file. These libraries are listed below:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jetty</artifactId>
</dependency>
<dependency>
    <groupId>jakarta.servlet</groupId>
    <artifactId>jakarta.servlet-api</artifactId>
    <version>6.0.0</version>
</dependency>
<dependency>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-server</artifactId>
    <version>11.0.14</version>
</dependency>

If the error persists, note that the Spring Boot 3.0 release notes recommend downgrading the Servlet API to 5.0 when running on Jetty 11, using the jakarta-servlet.version property.

Please let me know in the comments section if the above solution doesn't work for you, and thanks for checking out my site.

Concurrency: How does Postgres handle it like G.O.A.T 🦸 (Part-1)


Concurrency: Concurrency is crucial in application design and development, as most modern applications need to support concurrent requests. This article explores how the Postgres database manages concurrency.

Database concurrency is the ability to support multiple users and processes working on the database at the same time.

Before diving deep, let's get the basics right by understanding what we expect from a relational database.

ACID:

Atomicity (A): The "all or nothing" principle. Operations must either complete successfully as a whole or be rolled back entirely. For instance, transferring money between accounts involves two steps (debiting one account and crediting the other), which, if not handled properly, can lead to inconsistencies. By wrapping the transfer in a single transaction using BEGIN and COMMIT/ROLLBACK, that is, executing it as one unit, the database ensures that either the entire transaction succeeds or it fails and leaves the state unchanged.

Consistency (C): Data must be in a consistent state when the transaction starts and when it ends.

Isolation (I): Concurrent transactions should not interfere with each other even though they run at the same time. This is easy to explain in theory but hard to achieve; more on this shortly.

Durability (D): Once a transaction succeeds, its data must be written to persistent storage rather than kept only in volatile memory, so that it survives system failures.

Together, these four properties ensure that a transaction behaves as expected, without anomalies. A database that has these properties is said to be ACID-compliant.

Read Phenomena (Concurrency Anomalies): In the concurrent world, multiple transactions run simultaneously and may interfere with each other, leading to various read phenomena.
These phenomena can occur when a low isolation level is used. Databases can encounter four distinct types of concurrency issues:

- Dirty Read
- Non-Repeatable Read
- Phantom Read
- Lost Update

These problems arise because transactions execute concurrently, and reading and writing the same data at the same time can produce inconsistencies.

Dirty Read: A dirty read occurs when a transaction reads uncommitted data written by another transaction. For instance, consider a scenario where Bob's account balance is initially $1000. In transaction T1, his balance is updated to $110, but T1 has not yet been committed or rolled back. If another transaction reads Bob's balance during this intermediate state and acts on that value, it can lead to problematic outcomes, because T1 may still roll back.

Note: The ❌ icon marks the problem with this phenomenon.

Non-Repeatable Read: A non-repeatable read can be challenging to grasp, possibly because of its name. It occurs when a transaction (T1) reads the same record twice and sees a different version of the record in the second read. The discrepancy arises because another concurrent transaction (T2) updated the record between the two reads.

Note: The ❌ icon marks the problem with this phenomenon.

In T1's first read, the interest rate was 5%; T2 then changed it to 5.5%. This is a problem because T1 sees the new rate of 5.5% in its second read.

Phantom Read: A phantom read is similar to a non-repeatable read: a transaction T1 observes a different set of records in subsequent reads because another concurrently executing transaction has committed inserts, updates, or deletes.
Note: The ❌ icon marks the problem with this phenomenon.

T1 observes a different sum of quantities in the second read than in the first because it can see the modifications made by transaction T2.

Lost Update: A lost update occurs when two transactions independently read the same object and then both try to update it. The transaction that commits last overwrites the update made by the earlier one, so the first transaction's change is lost.

Note: The ❌ icon marks the problem with this phenomenon.

We expect a database to eliminate these anomalies and to support the ACID properties. Notably, only the serializable isolation level resolves all of the anomalies. However, it comes with trade-offs, which are explored in later sections.

Isolation levels to the rescue: To address the read phenomena, the ANSI (American National Standards Institute) SQL standard defines four isolation levels:

- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE

The illustration shows which read phenomena can occur at each isolation level. The lowest level, READ UNCOMMITTED, fails to address any of them. READ UNCOMMITTED can therefore be ruled out, which explains why the default isolation level in Postgres is READ COMMITTED. (Postgres accepts READ UNCOMMITTED but treats it as READ COMMITTED.)

Source: https://en.wikipedia.org/wiki/Isolation_(database_systems)

Can we observe these isolation levels in action?

Pre-requisites:

1. Install Rancher Desktop or Docker Desktop to be able to run containers. We will be running Postgres in a container. Refer to my earlier post to learn more about installing Rancher Desktop.

2. Run docker compose up and connect to the running Postgres container.

3. Create a database of accounts using pgAdmin. Open http://localhost:5050 in a browser. Get the IP address of the running Postgres container and use it to configure the new server. Command to get the running container's IP:

docker inspect -f \
'{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
<containerId>

4. Prepare some data for our demo:

Source: https://gist.github.com/jinagamvasubabu/c79718e7919e7054b373b11ef7477fac

READ COMMITTED: The READ COMMITTED isolation level is designed to prevent dirty reads. It guarantees that you only ever read data that has been committed, regardless of any ongoing transactions, so the data you read is in a consistent and stable state.

Under the surface, READ COMMITTED in Postgres relies on MVCC (Multi-Version Concurrency Control), or snapshots. MVCC keeps multiple versions of a row, each tagged with the ID of the transaction (XID) that created it, so that a transaction reads only committed versions.

Benefits: Suitable for scenarios with high contention, potentially offering better performance than the stricter isolation levels.

Challenges: READ COMMITTED alone is not enough, as it still allows non-repeatable reads, lost updates, and phantom reads. Depending on the application's requirements, stricter isolation levels or additional strategies may be needed to get consistent and predictable behavior.

Repeatable Read: This isolation level avoids dirty reads, lost updates, and non-repeatable reads completely. In Postgres's implementation it also prevents phantom reads.

Avoiding Lost Updates: Let's see an example of how this isolation level avoids lost updates. Postgres throws an error if it detects a lost update, so that the application can retry or fail fast.

Explanation: In the given scenario, both T1 and T2 read the balance of 'Srinu'. First, T1 updates the balance of 'Srinu' to 90. Subsequently, T2 also attempts to update the balance of 'Srinu'.
However, this time, T2 has to wait until T1 either commits or rolls back, to avoid a lost update. When T1 commits, the waiting T2 fails with the error "could not serialize access due to concurrent update." This error is raised to prevent concurrent updates that could result in inconsistencies. The client application can handle it by either retrying the transaction or failing fast, depending on its requirements. By catching and handling the error, the application prevents lost updates and preserves data integrity.

The lost update anomaly is a serious phenomenon because it can lead to serious bugs if we don't avoid it. Under the hood, this isolation level uses MVCC to avoid lost updates; MVCC will be discussed in more detail in future sections.

Avoiding Non-repeatable reads: As explained in the previous sections, a non-repeatable read happens when a transaction reads the same row multiple times during its lifespan, but the values of the row change between the reads. This can lead to unexpected and inconsistent query results. To avoid it, we can use the REPEATABLE READ isolation level.

Explanation: In this scenario, T1 reads the balance of 'Srinu' as 100, unaware that T2 has modified the balance. The REPEATABLE READ isolation level ensures that T1 keeps a consistent view of the data for the duration of its transaction. When T1 attempts to modify the balance of 'Srinu', Postgres detects a concurrent update and throws an error: "could not serialize access due to concurrent update." This error is raised to prevent conflicting modifications that could result in data inconsistencies. By enforcing a repeatable read isolation level, Postgres ensures that transactions see a consistent snapshot of the data throughout their execution.
This helps maintain data integrity and prevents unexpected behavior caused by concurrent updates.

Serializable: The serializable isolation level is the highest isolation level. It prevents all anomalies, including dirty reads, non-repeatable reads, and phantom reads. Under the hood, it uses MVCC together with predicate locking to detect conflicting modifications to data that a transaction has already read, and it throws an error so that the affected concurrent transaction can retry and eventually succeed.

Highs:
- Avoids all read phenomena (dirty reads, phantom reads, non-repeatable reads, lost updates).
- Can be used where it is important to prevent users from seeing stale or inaccurate data.

Lows:
- Reduced concurrency. Avoid this isolation level if your application requires high concurrency.

In the next part, I will cover some advanced concepts that help Postgres achieve concurrency. Stay tuned for part 2:
- MVCC/Snapshots
- Locks (shared vs. exclusive locks)
- Optimistic vs. pessimistic locks

References:
https://wiki.postgresql.org/images/9/97/Concurrency.pdf
https://www.postgresql.org/files/developer/concurrency.pdf
https://www.zghurskyi.com/lost-update
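The "retry or fail fast" handling described for the "could not serialize access due to concurrent update" error can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Postgres works internally: `ToyAccount`, `SerializationFailure`, `withdraw`, and the `before_commit` hook are all hypothetical names made up for illustration, and the conflict is detected with a simple version counter instead of row locks and snapshots.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for Postgres error 40001 (serialization_failure)."""


class ToyAccount:
    """One row with a version counter; a toy model, not Postgres internals."""

    def __init__(self, balance):
        self.balance = balance
        self.version = 0

    def snapshot(self):
        # What a transaction sees when it starts: the value and its version.
        return self.balance, self.version

    def commit(self, seen_version, new_balance):
        # Reject the write if someone else committed since our snapshot.
        if seen_version != self.version:
            raise SerializationFailure(
                "could not serialize access due to concurrent update"
            )
        self.balance = new_balance
        self.version += 1


def withdraw(account, amount, max_attempts=3, before_commit=None):
    """Withdraw `amount`, retrying the whole transaction on conflict.

    `before_commit` is a test hook that simulates another transaction
    committing between our read and our write (first attempt only).
    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        balance, version = account.snapshot()
        if before_commit is not None and attempt == 1:
            before_commit(account)
        try:
            account.commit(version, balance - amount)
            return attempt
        except SerializationFailure:
            continue  # re-read fresh data and try again
    raise SerializationFailure("giving up after %d attempts" % max_attempts)


def concurrent_withdraw_10(account):
    # Plays the role of the other transaction, which commits first.
    balance, version = account.snapshot()
    account.commit(version, balance - 10)


srinu = ToyAccount(100)
attempts = withdraw(srinu, 10, before_commit=concurrent_withdraw_10)
print(attempts, srinu.balance)  # 2 80
```

The first attempt fails because the other withdrawal committed in between; the retry re-reads the balance, so both withdrawals are applied and neither update is lost. A fail-fast client would simply set `max_attempts=1` and surface the error.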

Design Payment Gateway - Understanding the payment jargon


Payment is a vast domain. To design a payment gateway, one must understand the business and the terms (jargon) behind it. I strongly feel that before designing any system, one should understand the problems it is trying to solve. In this article, I explain the traditional credit card transaction lifecycle and the entities, security, and protocols involved in it.

The primary entities involved in the card transaction lifecycle are:

- Customer
- Merchant
- Customer's Bank (Issuer Bank)
- Merchant's Bank (Acquiring Bank)
- Card Scheme/Network

Customer: A customer is someone who has a bank account and a credit/debit card that he or she uses to pay a merchant for goods or services.

Merchant: A merchant runs a business that sells goods to customers and receives money from them in return. The merchant holds an account with a bank and uses that bank (the processor) to accept card payments.

Issuer/Issuing Bank (Customer's bank): Only banks or financial institutions can issue credit or debit cards to customers, which is why such a bank is called an "issuer bank". When a customer swipes the card at the merchant's POS (Point of Sale) terminal, the transaction is routed through a secure card network (e.g., VisaNet) to the issuer bank, which approves or declines it depending on the state of the customer's account. The acquirer bank (the merchant's bank) settles the money with the issuer after the transaction completes (the clearing and settlement process).

Acquirer Bank (Merchant's bank): An acquirer bank is just like any other bank or financial institution. It helps merchants accept card payments and is also called a processor, because it processes the merchant's transaction requests. Acquirer banks also provide POS terminals for accepting card payments, and they maintain the transaction acquiring infrastructure as well.
Card Scheme/Network: A card scheme or network is a central payment network that processes debit and credit card payments (e.g., VisaNet). Visa, Mastercard, American Express, and UnionPay are big players in the card scheme business. Banks and other official financial institutions apply for membership in a scheme in order to issue credit or debit cards to customers. The Payment Card Industry Data Security Standard (PCI DSS) defines 12 requirements for anyone handling cardholder data, to keep customers' card data secure. In simple words, a scheme is a network that connects issuers and acquirers.

A credit card transaction goes through multiple phases (legs):

1. Authorization
2. Clearing and settlement
3. Chargeback (dispute case)

1. Authorization: The first stage in a transaction's life cycle.

- The customer swipes the credit card issued by a bank (the issuer bank) at the merchant's POS terminal to pay for goods or services.
- The merchant's POS takes the request and sends the transaction details to the merchant's bank (the acquirer bank, aka the processor) in the ISO 8583 message format.
- The acquirer bank receives this request, performs some security validation, and sends it to the issuer bank via the card network (scheme).
- The card network acts like a router and a registry server; it maintains information about all issuer and acquirer banks and forwards the authorization request to the right issuer bank.
- The issuer bank approves or declines the request depending on multiple factors (Is the card valid? Does the customer have enough credit?). On approval, the issuer bank puts the funds on hold (an authorization hold).
- The issuer bank sends the authorization code response back to the acquirer bank (processor) through the card network, over a TCP/IP connection in the ISO 8583 message format. Newer payment protocols are JSON/XML APIs over HTTP.
At this point the merchant has technically not received any funds for the goods or services provided and has to wait for the clearing and settlement process, which happens at regular intervals during the day. This whole process is called authorization.

2. Clearing and settlement: When will the merchant get the money for the goods he or she has provided to the customer?

- At the end of the day, the merchant sends all of the day's authorized transactions in a batch file from the merchant POS terminal to the merchant's (acquirer) bank. Merchants with very large transaction volumes can send multiple files instead of one big file.
- The acquirer bank (processor) checks the transactions provided by the merchant against its own transaction list (reconciliation) and looks for discrepancies.
- For all the clean transactions, the acquirer credits the amount to the merchant's account. If the clearing cycle is T+1, the transactions credited today are those from T-1, where T = today.
- The card network (scheme) receives the clearing files from the different acquirers, sorts them, and sends them to the respective issuing banks.
- The scheme computes the interchange fee and currency conversion (FX) charges, if applicable, arrives at one final amount per transaction, and sends the clearing file to the issuer.
- The issuer bank also reconciles the file against its authorized transactions and releases the funds to the acquirer bank.

Let us take a look at different schemes and their clearing systems.

Credit: https://www.youtube.com/c/LearnPayments

3. Chargeback (Dispute case): A chargeback is a consumer protection tool that allows consumers to get their money back for fraudulent transactions by submitting a complaint (dispute) to their issuer bank (the customer's bank). A chargeback is different from a refund. With a refund, you ask the merchant to return the money, for example for a defective item, and if the merchant agrees, the merchant returns it.
A chargeback case arises when the merchant does not accept your request for a refund. For more info, refer to cnbc.com/select/what-is-a-chargeback/

Before ending this topic 👋, I would like to address a few questions you might have in mind:

- What is reconciliation and why is it important?
- What message format is used to exchange the messages?
- How is the issuer/acquirer bank able to connect to the card network?

What is reconciliation and why is it important? In simple terms, reconciliation means comparing your account balance for the day with the payment gateway's transaction report to verify income and expenses. With the spread of online commerce, it is very hard to check daily transactions coming from many different sources by hand, so maintaining separate software for reconciliation makes the job easier for both the pay-in and pay-out flows.

What message format is used to exchange the messages? All the request messages are exchanged between these systems in the ISO 8583 format and sent over a TCP/IP socket. Recent payment protocols use JSON/XML over HTTP.

How is the issuer/acquirer bank able to talk to the card network? Mastercard/Visa provide hardware/software called the Mastercard Interface Processor/Visa Access Network, which handles communication with the bank networks (issuer/acquirer). It is installed in the bank's datacentre, much like a sidecar, and uses a TCP/IP socket connection for communication between the two networks.

Credit: https://www.youtube.com/c/LearnPayments

Note: In this article, we discussed the payment transaction lifecycle of a credit card, but it looks more or less similar for other payment modes.
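To make the reconciliation step from the clearing process a little more concrete, here is a minimal Python sketch. It assumes each side keeps a simple mapping of transaction id to amount; the function name `reconcile`, the ids, and the amounts are hypothetical and chosen only for illustration. Real clearing files carry many more fields.

```python
from decimal import Decimal


def reconcile(merchant_batch, acquirer_records):
    """Compare two {transaction_id: amount} mappings.

    Returns (clean_ids, discrepancies), where discrepancies maps a
    transaction id to a short reason string.
    """
    clean, discrepancies = [], {}
    for txn_id in sorted(set(merchant_batch) | set(acquirer_records)):
        if txn_id not in acquirer_records:
            discrepancies[txn_id] = "missing at acquirer"
        elif txn_id not in merchant_batch:
            discrepancies[txn_id] = "missing in merchant batch"
        elif merchant_batch[txn_id] != acquirer_records[txn_id]:
            discrepancies[txn_id] = "amount mismatch"
        else:
            clean.append(txn_id)
    return clean, discrepancies


# Hypothetical end-of-day data: the merchant's batch file versus the
# acquirer's own list of authorized transactions.
merchant_batch = {
    "t1": Decimal("25.00"),
    "t2": Decimal("40.00"),
    "t3": Decimal("9.99"),
}
acquirer_records = {
    "t1": Decimal("25.00"),
    "t2": Decimal("45.00"),
    "t4": Decimal("5.00"),
}

clean, discrepancies = reconcile(merchant_batch, acquirer_records)
print(clean)          # ['t1']
print(discrepancies)
```

Only the clean transactions would be credited to the merchant; the rest need investigation. Amounts use `Decimal` rather than floats so that money comparisons are exact.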

Replace docker with Rancher+Nerdctl 🔥


As most of you know, Docker Desktop (including the Docker CLI bundled with it) is no longer free to use as of January 31st, 2022. Of course, it remains free for small businesses (fewer than 250 employees AND less than $10 million in annual revenue), personal use, education, and non-commercial open-source projects. So clearly I cannot use Docker Desktop for my day-to-day development. After a lot of research, I found a combination that can replace it.

Reference: https://www.docker.com/blog/do-the-new-terms-of-docker-desktop-apply-if-you-dont-use-the-docker-desktop-ui/

Rancher Desktop + Nerdctl (containerd)

Reasons:
- Almost no changes are needed to run existing Docker containers.
- You can run your own local Kubernetes cluster and specify the Kubernetes version.
- Rich user interface.
- Nerdctl and Docker CLI commands are almost identical.

Disclaimer: I haven't tested it on Apple Silicon M1 computers, but it worked perfectly on an Intel Mac.

Rancher Desktop: An open-source desktop application for Mac, Windows, and Linux. Rancher Desktop runs Kubernetes and container management on your desktop. It uses k3s (a lightweight Kubernetes distribution) for the local Kubernetes cluster.

Credit: https://rancherdesktop.io/

Pros:
- GUI
- You can choose the version of Kubernetes you want to run
- Build, push, pull, and run container images using either containerd or Moby (dockerd)
- Open source

Rancher Desktop alone is not enough to fully replace Docker, including the CLI, which is where Nerdctl comes in to support the Docker CLI commands.

Nerdctl: Nerdctl is a subproject of containerd (an open-source container runtime) and is a Docker-compatible CLI for containerd. Almost all nerdctl commands match the Docker CLI commands, with a few exceptions such as docker system prune.
Pros:
- Docker-compatible CLI
- Supports Docker Compose (nerdctl compose up/down)
- Can build and run Docker images

Quick Tip: If you are only familiar with docker commands, you can define an alias and add it to your shell configuration (.zshrc, .bashrc, etc.). Note that there must be no spaces around the equals sign:

alias docker=nerdctl

Demo:

1. Download the Rancher Desktop client: https://rancherdesktop.io/

2. Update Kubernetes Settings: I disabled Kubernetes for the sake of this article, allocated 2 GB of memory and 2 CPU cores to start with, and selected containerd as the runtime.

3. Supporting Utilities: Enable nerdctl.
Note: Nerdctl is the default CLI in newer versions of Rancher Desktop, and the Supporting Utilities section has been removed from the UI.

4. That's it, now you can run your Docker containers on your local machine.

With the alias: docker compose up -d
Without the alias: nerdctl compose up -d

Test: For testing, I created a Docker Compose file to spin up Postgres and pgAdmin containers on my local machine.

docker-compose.yml:

Credit: https://github.com/khezen/compose-postgres

Running docker compose up -d will spin up all the containers in detached mode.

For more info, you can refer to this awesome video by devops-toolkit.

References:
https://github.com/containerd/nerdctl
https://github.com/khezen/compose-postgres
https://rancherdesktop.io/

Is Domain-Driven Design really worth it?


If you're reading this blog, you're possibly confused and have a lot of questions, such as:

- Is DDD worth it?
- Is the DDD philosophy against the YAGNI principle?
- Is DDD overrated?
- Should I bother about DDD?
- Is DDD a silver bullet, and can it be applied to all projects?
- Is DDD a waste of effort?

I'll try to address these questions in this blog, but first, let me tell you a short rejection tale that inspired me to learn DDD (Domain-Driven Design).

Interviewer: How do you identify a microservice?
Me: A microservice should be small in size and should do only one thing!
Interviewer: How small? What's the right size?
Me: I answered blah.. blah...
Interviewer: Do you involve domain experts in this process?
Me: No...

After 40 minutes of massacre 🤯🤯, the interviewer said these golden words to me: "You can leave for the day, HR will get back to you!"

After getting home, I spent about 30 minutes googling the same question and came across the phrases "Bounded Context" and "DDD" on various sites. Before we go into bounded contexts, let's first figure out what DDD is and why we need it.

Microservices: Microservices are widely used. They are a modular strategy that functionally decomposes an application into a collection of fine-grained services, and they are critical when designing large-scale, sophisticated systems. Most of our applications are now too complicated for a single person or team to manage, comprehend, and implement, so a team of developers must break them into modules that can be built and understood.

To understand microservices better, let us take the example of a simple e-commerce microservices architecture.

Credit: https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-1-richardson/

The above image shows the microservices that make up an e-commerce application. Each microservice has an impermeable boundary (bounded context) that is difficult to violate.
Also, microservices must have high cohesion and low coupling, so that they do not depend on each other and can scale independently. Identifying microservices is not as easy as it sounds, because you need a solid understanding of the domain and its transactional boundaries, and each service needs a single well-defined purpose. Thankfully, DDD helps you solve the problem of identifying microservices:

- Analyze the domain: Analyze the business domain to understand the application's functional requirements, e.g. in an Event Storming session.
- Define bounded contexts: Define the bounded contexts of the domain. Each bounded context contains a domain model that may represent a particular subdomain of the larger application. We will cover this in a while.
- Define entities, aggregates, and services: Within a bounded context, apply tactical DDD patterns to define entities, aggregates, and domain services.
- Identify microservices: Following the above steps leads to the desired result.

Credit: https://docs.microsoft.com/en-us/azure/architecture/microservices/model/domain-analysis

As a first step, we need to analyze the domain! But how?

1. Analyze the Domain:

DDD stands for Domain-Driven Design. It is neither a technology nor a tool; rather, it is a software design methodology that focuses on modeling software to fit a domain, based on input from domain experts. Eric Evans introduced DDD in his book "Domain-Driven Design", often known as the "Blue Book". In layman's terms, DDD is a design methodology that enables us to create better, more maintainable, domain-focused software.

Why DDD? People often jump straight into implementation details before thinking about the domain (the problem statement), and the domain is exactly what DDD focuses on. DDD is a process in which domain experts (stakeholders) and the engineering team talk in a common language, learn more about the domain, and design the system better.
Domain: Why is the domain so important for DDD, other than "domain" being the first word in its name? Let's figure it out! A domain is an area of knowledge or activity, for example e-commerce, banking, or mobility. Every domain contains a set of activities and processes, along with the experts responsible for carrying them out, whom we call "domain experts". Let's take e-commerce as an example. It contains many activities and processes, such as:

- Inventory Management
- UI (Web/Mobile)
- Delivery
- Finance
- Order Management
- Warehouse Management
- ...

Subdomain: A domain is too big to understand as a whole, so it is better to break it into smaller pieces called subdomains. Subdomains are categorized into three types:

Core Subdomain: Where the main business flows and models are understood and defined. For instance, in an e-commerce application, the main activities we need to support are offer generation, catalog management, etc.

Supporting Subdomain: All the activities that are built to support the core subdomain. In the previous example, the Shipping and Backoffice subdomains support the core subdomain.

Generic Subdomain: Activities that are common across multiple domains but are not part of the core domain, for example Analytics and Payment. These subdomains can be part of many other domains, so we can build them generically to support all of them.

Domain Expert: It is practically impossible for one person to know the entire domain, which is where domain experts come in. Domain experts are people who are knowledgeable about a certain subdomain.

Ubiquitous Language: To comprehend the domain better, we need the help of domain experts to model it. An Event Storming session helps the engineering team understand the model better by bringing the domain experts into the room.
During an event storming session, business experts may use business jargon and the engineering team may use technical jargon. Both sides then have to translate to understand each other, and there is a high chance of miscommunication or loss in translation. In the DDD world, domain experts and technical experts should use the same language, which is called the "ubiquitous language". Software built on top of this language is easy to understand because the code reflects the business terminology.

2. Identifying Bounded Context:

It is almost impossible to create one model that represents the whole domain. Even if you can, it makes the system complex and difficult to understand and translate into code. DDD deals with these problems by splitting the domain into independent parts called bounded contexts. A bounded context defines the limit of applicability of a ubiquitous language. Since domains and subdomains often become big, it is difficult to model them as one single model.
Instead of a single large model for the domain, you have many smaller ones that are well-defined and have specific boundaries for what they are meant to encapsulate. Also, a word used in one bounded context can mean something different in another bounded context. Confused?
Credit: https://medium.com/raa-labs/part-1-domain-driven-design-like-a-pro-f9e78d081f10
As Anders Gill explained in his blog, the meaning of a specific word varies from one bounded context to another. The word "Serve" means something different in a restaurant context than in a tennis context. "Set" and "Pour" are other examples. If you take the E-commerce domain as an example, we can identify multiple bounded contexts and the relations between them (context maps) as shown below. "Item" is a common term that can be used in every bounded context, and it has a different meaning in each one. So instead of keeping it as a shared or generic model, it is better to model it separately in each context.

The main characteristics of bounded contexts:
Each bounded context should have its own domain model.
Each bounded context uses its own ubiquitous language.
A domain model built for a bounded context is only suitable within its boundaries.

We have almost reached the end of this article, and I hope you now have some idea of strategic DDD. Now let's get to the main question of the article. Strategic DDD mainly deals with analyzing a domain (bounded contexts, domains, subdomains, context mapping, etc.), while tactical DDD deals with implementation using DDD building blocks (Entity, Aggregate, Services, etc.).

Is DDD worth it? I feel we cannot call something good or bad without analyzing its benefits and drawbacks.

Benefits:
Flexibility: When a project is developed using DDD, it is easier to evolve and change things like business processes, implementations, or technology stacks, which gives us a better time to market.
Code is organized: DDD can be used in conjunction with Hexagonal architecture, which requires our code to adhere to a predefined structure, allowing developers to test the relevant levels and adjust the code without fear of breaking anything. You can be a Product Minded Engineer: Product-minded engineers are developers that have a strong passion for their product. During the domain analysis phase, you learn more about the product by speaking with stakeholders (product owners); this will help you make crucial decisions when designing the product. In my perspective, a good product-minded engineer can take the product to the next level, and they are the people who can easily transition from individual contributor to manager. Domain logic is in one place: All the business logic(domain layer) would be separated from the infrastructure and the application layer if you are following Hexagonal architecture. so you can focus mainly on the domain layer and test it thoroughly with unit and component tests. Credit: https://fideloper.com/hexagonal-architecture Drawbacks: It is not a silver bullet: DDD only works in complex domains, which are difficult not only in terms of technology but also in terms of business. Projects that are more technical and involve less business interaction, such as developing a generic logging system or platform libraries, are not a suitable match for DDD. The learning curve is steep: You need to have a solid understanding of DDD both strategic and tactical DDD. This is why any team that wants to build software based on DDD should possess good knowledge in this practice. Time and effort: Most of your time will go into understanding the domain and talking to domain experts but sometimes this is really worth understanding more about the domain so that the engineering team can design the system better. It is very easy to do wrong: As I mentioned before DDD is a philosophy and not a technology so it is very easy to do wrong. 
You should be agile enough to keep refactoring your design until you get a near-perfect domain model.
You should possess good communication skills.
You should be able to make changes to your design quickly.

Final Verdict: DDD is not a silver bullet; it has its own challenges, so be pragmatic and decide whether to adopt DDD based on the nature of your project.

References:
https://medium.com/swlh/event-sourcing-as-a-ddd-pattern-fea6de35fcca
https://docs.microsoft.com/en-us/azure/architecture/microservices/model/tactical-ddd
https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-1-richardson/
https://medium.datadriveninvestor.com/if-youre-building-microservices-you-need-to-understand-what-a-bounded-context-is-30cbe51d5085
https://docs.microsoft.com/en-us/dotnet/architecture/microservices/microservice-ddd-cqrs-patterns/infrastructure-persistence-layer-design
https://medium.com/raa-labs/part-1-domain-driven-design-like-a-pro-f9e78d081f10
https://medium.com/tacta/a-decade-of-ddd-cqrs-and-event-sourcing-74edc8211039
DDD, event sourcing and CQRS – theory and practice
https://www.developer.com/design/domain-driven-design-understanding-bounded-context-and-the-context-map/
https://fideloper.com/hexagonal-architecture

Demystified | Concurrency and Parallelism

Many people confuse the terms concurrency and parallelism, and this article tries to clear up those confusions and myths. Some of the common questions people ask are:
What are concurrency and parallelism?
Is concurrency == parallelism?
Does concurrency really increase performance?
Can parallelism exist without concurrency?
Can concurrency exist without parallelism, or can they both exist?
Are concurrency and parallelism possible on a single-core (CPU) machine?

Concurrency and Parallelism: As Rob Pike said, "Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." The words "dealing" and "doing" are what create the confusion 😟. Let's talk in layman's terms.

Concurrency: Concurrency means making progress on multiple tasks over the same period of time, but not necessarily at the same instant. To understand it better, take a single-core CPU machine running a web browser. The OS scheduler schedules tasks such as the UI process, the rendering process, and the network process, and gives each of them a time slice of the single CPU. A context switch also happens when a process is waiting or blocked (for example, on an I/O operation). Context switches are costly; on a machine with more cores, more processes can keep running without being switched out, which reduces the number of context switches and makes the system run faster.

Still not clear? Let's take a real-world example. Suppose you are jogging on a nice morning and your shoelace comes untied. To tie the lace you need to stop jogging, right? You cannot do these two tasks at the same instant: you have to pause one to do the other, and the order in which you handle them is not important.

Parallelism: Parallelism is about doing a lot of things at once. Parallelism essentially requires hardware with multiple processing units. On a single-core CPU, you may get concurrency but NOT parallelism.
Let's take the same example of web browser processes. Now all four processes run on four separate cores, which makes things run faster.

So all we need is parallelism; why think about concurrency at all? Because parallelism comes with a cost 💰💰💰💰
You need hardware with multiple processing units.
You need to split the work into independently executable computations (functions) with no interdependency on each other.

Concurrency enables parallelism. Concurrency is about structuring a program into multiple independent pieces (functions) that can run independently, but not necessarily in parallel.

Why is achieving concurrency hard?
You have to make sure the program is split into independently executable functions.
Threads (goroutines) share the same memory, so there is a high chance of race conditions and deadlocks.
Thread safety (locks, mutexes) comes with a performance penalty.
Testing concurrent code is hard.

Concurrency and Parallelism existence: There are a lot of questions about whether concurrency and parallelism can exist together or on their own. The table below tries to explain the combinations.

Concurrency | Parallelism | Meaning
✅ | ✅ | An application that can run multiple tasks concurrently on a multi-core CPU machine.
❌ | ✅ | An application works on only one task at a time, and this task is broken down into subtasks that can be processed in parallel. However, each task and its subtasks are completed before the next task is split up and executed in parallel.
✅ | ❌ | An application makes progress on more than one task over the same period, but no two tasks are executed at the same instant. A single-core CPU machine is a classic example of this.
❌ | ❌ | An application processes all tasks one at a time, sequentially.
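To make the table a little more concrete, here is a minimal Go sketch. It is only an illustration under assumptions: sumSquares is a hypothetical helper, and runtime.GOMAXPROCS(1) is used to limit Go code to one executing thread, which roughly corresponds to the "concurrency without parallelism" row. The same concurrent program structure works in both settings; only the degree of parallelism changes.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares starts one goroutine per value and adds up the squares.
// The goroutines are independent pieces of work (concurrency); whether any
// of them run at the same instant (parallelism) depends on how many threads
// the Go scheduler may use and how many cores the machine has.
func sumSquares(values []int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, v := range values {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			sq := n * n
			mu.Lock() // guard the shared total against a data race
			total += sq
			mu.Unlock()
		}(v)
	}
	wg.Wait()
	return total
}

func main() {
	values := []int{1, 2, 3, 4}

	// Concurrency without parallelism: only one goroutine executes at a time.
	prev := runtime.GOMAXPROCS(1)
	fmt.Println("GOMAXPROCS=1:", sumSquares(values))

	// Restore the previous setting: goroutines may now run in parallel
	// if the machine has more than one core.
	runtime.GOMAXPROCS(prev)
	fmt.Println("GOMAXPROCS restored:", sumSquares(values))
}
```

Both calls print the same sum; the mutex is the "thread safety comes with a penalty" cost mentioned in the list of difficulties.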
I am going to talk more about concurrency in my next articles, stay tuned to hangoutdude References: https://howtodoinjava.com/java/multi-threading/concurrency-vs-parallelism/ http://tutorials.jenkov.com/java-concurrency/concurrency-vs-parallelism.html https://medium.com/@itIsMadhavan/concurrency-vs-parallelism-a-brief-review-b337c8dac350 https://www.udemy.com/course/concurrency-in-go-golang

How to visualize/display raw Postgres Ltree data?

Why do we need Ltree? Suppose you have a hierarchical data structure in your application, something like the above. How would you save it in a database, and how would you represent this complex tree as flat rows and columns? You might think of solving it with foreign key relationships, but maintaining the tree and retrieving data with recursive queries is not easy, and it becomes very expensive when the tree is modified. Another approach would be to use a graph-oriented database like Neo4j, which is designed for these use cases, but you don't want to leave Postgres behind because you already have a working, well-tested application.

Ltree: Ltree is a data type used to represent hierarchical tree-like structures as flat rows and columns in Postgres. For more info, refer to https://www.postgresql.org/docs/9.1/ltree.html
I have been using the Ltree extension for quite a long time to store hierarchical data, and it works perfectly fine. Sample Ltree data for the above tree looks like this.
Source: https://gist.github.com/jinagamvasubabu/14dbce3ca89199e083488d80f2d80d64
The path column does all the magic: "1.2.9" means the parent of Telegram(9) is Messenger(2), and the parent of Messenger(2) is Apps(1).

Why do we need LtreeVisualizer? The simple answer is that Postgres doesn't provide any UI to visualize Ltree data 😟 Ltree labels are separated by dots, like 1.2.3.4, which is not easy to picture as a tree. I have around 5000 nodes in my production data, and it is very tough to visualize a subtree for debugging purposes, even though Ltree provides good querying capability. So I thought of writing a library to visualize this data using DOT graphs and Graphviz.

DOT is a graph description language; with it we can represent directed graphs, undirected graphs, and flowcharts. https://en.wikipedia.org/wiki/DOT_(graph_description_language)
digraph graphname {
a -> b -> c;
b -> d;
}
Graphviz is open-source graph visualization software, and it can render DOT graphs.

How to use it?

DB way (connect to the DB and fetch the Ltree data): LtreeVisualizer can connect to your DB using gorm, fetch the data for the query you provide, and convert it to the LtreeData format that LtreeVisualizer understands.
Simple Example:
Note: The query result set should contain id, name, and path; please use aliases if your column names are different.

Using interim JSON: You need to prepare your Ltree data as per the struct below.
// VisualizerSchema is the contract to send to ltreevisualizer
type VisualizerSchema struct {
	Data []data `json:"data"`
}

type data struct {
	ID   int32  `json:"id"`
	Name string `json:"name"`
	Path string `json:"path"`
	Type string `json:"type"`
}
That's it. Pass this data to LtreeVisualizer and it converts it to either a DOT graph string or an image. Refer to the LtreeVisualizer README.md for more info on how to use it.

Examples: You can refer to the examples at https://github.com/jinagamvasubabu/ltreevisualizer/tree/main/examples to play around and see how it works.

Conclusion: Ltree is a great extension for Postgres, and the only thing lacking is a UI to visualize the data. I hope LtreeVisualizer can fill this gap. Let me know in the comments section if you want a dedicated Ltree tutorial. If you like this repo and my idea, please give a ⭐ to this library: https://github.com/jinagamvasubabu/ltreevisualizer

References:
http://patshaughnessy.net/2017/12/11/trying-to-represent-a-tree-structure-using-postgres
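As a rough illustration of the idea in this post (this is not LtreeVisualizer's actual implementation or API), the sketch below turns id/name/path rows into a DOT digraph. The node type and the toDOT helper are hypothetical, and the sketch assumes that the last label of each path equals the row's id, as in the "1.2.9" example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// node mirrors the id, name, and path columns described in the post.
type node struct {
	ID   int32
	Name string
	Path string // ltree path, e.g. "1.2.9"
}

// toDOT is a hypothetical helper. For every path it emits one
// "parent -> child" edge built from the last two labels of the path.
func toDOT(nodes []node) string {
	names := make(map[string]string, len(nodes))
	for _, n := range nodes {
		names[fmt.Sprint(n.ID)] = n.Name
	}
	// nameOf falls back to the raw label when no row has that id.
	nameOf := func(label string) string {
		if name, ok := names[label]; ok {
			return name
		}
		return label
	}

	var edges []string
	for _, n := range nodes {
		labels := strings.Split(n.Path, ".")
		if len(labels) < 2 {
			continue // a root node has no parent edge
		}
		parent := labels[len(labels)-2]
		child := labels[len(labels)-1]
		edges = append(edges, fmt.Sprintf("  %q -> %q;", nameOf(parent), nameOf(child)))
	}
	sort.Strings(edges) // stable output regardless of row order
	return "digraph ltree {\n" + strings.Join(edges, "\n") + "\n}"
}

func main() {
	rows := []node{
		{ID: 1, Name: "Apps", Path: "1"},
		{ID: 2, Name: "Messenger", Path: "1.2"},
		{ID: 9, Name: "Telegram", Path: "1.2.9"},
	}
	fmt.Println(toDOT(rows))
}
```

The printed text can be passed to Graphviz to render the tree. Quoting node names keeps names with spaces valid in DOT.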

Measuring Performance Metric 📈📈

Let's say you have developed a complex backend system that is battle-tested against all the functional requirements, and you are ready to hit production and collect a bonus for the hard work 💰💰💰. But just before the deployment, your boss sends an email asking for these details:
What do the mean, median, max, p90, p95, and p99 latencies look like?
At what load did you test?
And you have no idea what these buzzwords mean 😳

Metric: The behavior of software systems is hard for humans to understand, so we need metrics to judge whether the system is running fine in particular scenarios. Metrics are a proxy for reality!

Latency vs Response time: Latency and response time are often used synonymously, but they are different.

Response Time: Response time is what the client sees; besides the actual time to process the request (the service time), it includes network delays and queuing delays. Response times are not the same every time. For example, refer to Figure 1-1. Suppose your service usually returns a response in 100 ms; you cannot guarantee 100 ms on every request, because delays can be added. We therefore need to think of response time not as a single number, but as a distribution of values that you can measure.
Figure 1-1: Credits: Designing Data-Intensive Applications by Martin Kleppmann

Latency: Latency is the duration that a request is waiting to be handled, during which it is latent, awaiting service. In simple terms, latency is the time a request spends waiting before it is served.

Why won't averages work? Average response times don't give much information. Let's take an example of a system whose response times look like the ones below.
Response times of 10 requests in ms:
60, 120, 30, 20, 40, 55, 25, 65, 90, 920

Average: 143ms (Rounded)
The average response time is 143 ms, but it tells you nothing about how many users actually experienced that delay. Frankly speaking, your system is performing well (9 out of 10 requests finished in 120 ms or less), but that one slow request (920 ms) pulled the average up.

Use Percentiles (p50, p95, p99, p99.999...): Let's take the same example.
Sort the response times.
For the 50th percentile, multiply 50/100 by the number of requests (0.5 * 10 = 5).
Take the value at that index (the Nth value) as the 50th percentile, or p50.
Response times of 10 requests in ms:
60, 120, 30, 20, 40, 55, 25, 65, 90, 920

1. Sort it:
20, 25, 30, 40, 55, 60, 65, 90, 120, 920

2. Take 50/100th value:
(50/100)*10 = 0.5*10 = 5

3. Take Nth index value:
5th index value is 55ms

50th percentile or p50: 55ms (taking 1 as the starting index)
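The same steps can be sketched in Go. This is a minimal illustration of the naive (nearest-rank) approach walked through above, not a production implementation; percentile and mean are hypothetical helper names, and the data is the ten response times from the example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank percentile p (0 < p <= 100).
// It sorts a copy of the values, computes rank = ceil(p * n / 100),
// and returns the value at that 1-based rank.
func percentile(values []float64, p float64) float64 {
	if len(values) == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), values...) // do not reorder the caller's data
	sort.Float64s(sorted)

	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// mean returns the arithmetic average of the values.
func mean(values []float64) float64 {
	sum := 0.0
	for _, v := range values {
		sum += v
	}
	return sum / float64(len(values))
}

func main() {
	responseTimesMs := []float64{60, 120, 30, 20, 40, 55, 25, 65, 90, 920}

	fmt.Printf("mean: %.1f ms\n", mean(responseTimesMs))          // mean: 142.5 ms
	fmt.Printf("p50:  %.0f ms\n", percentile(responseTimesMs, 50)) // p50:  55 ms
	fmt.Printf("p90:  %.0f ms\n", percentile(responseTimesMs, 90)) // p90:  120 ms
	fmt.Printf("p99:  %.0f ms\n", percentile(responseTimesMs, 99)) // p99:  920 ms
}
```

Note how the single 920 ms request dominates the mean and p99 but leaves p50 and p90 untouched.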
The 50th percentile, or p50, is 55 ms, which means half of your requests return a response in 55 ms or less and the other half take longer.

Why do we need this metric? Latency percentiles are often used in Service Level Objectives (SLOs) and Service Level Agreements (SLAs). For example, you run a company XYZ and you have a contract with your clients that defines the expected performance and availability of your service, like below:
99.99% uptime
p99 latency is 900 ms
p95 latency is 500 ms
These metrics set expectations for the clients and allow customers to demand a refund if the SLA is not met.

Tail Latency: To figure out how bad your outliers are, you need to look at higher percentiles like p95, p99, and p99.9. Higher percentiles of response times are also called tail latencies, and they are important because they directly affect the user experience. For example, if the p99 response time is 1.5 seconds, then 99 out of 100 requests complete within 1.5 seconds, and 1 out of 100 takes longer than 1.5 seconds.

Why is tail latency important? A study conducted by Amazon showed that a 100 ms increase in response time reduced sales by 1%. Similarly, a Google study showed that a half-second delay in load time reduced site traffic by 20%. As per Amazon, the customers with the slowest requests are often those who have the most data on their accounts because they have made many purchases. Those are the most valuable customers, so losing them because of slow requests is an expensive mistake for a company.

Optimizing Tail Latency: Optimizing tail latency is not as easy as you might think.
For example, the 99.99th percentile (slowest 1 in 10,000 requests) was deemed too expensive and did not yield enough benefit for Amazon's purposes because reducing response times at very high percentiles is difficult because they are easily affected by random events outside of your control. Percentiles in Practice: The approach which I mentioned earlier is considered naive and some algorithms can calculate a good approximation of percentiles at minimal CPU and memory cost, such as Hdr Histograms https://github.com/HdrHistogram/HdrHistogram Forward Decay http://dimacs.rutgers.edu/~graham/pubs/papers/fwddecay.pdf t-digest https://github.com/tdunning/t-digest Hdr Histograms: Gil Tene is the GC and Latency Guru and he created this library in Java, and later ported it to other languages like Erlang, Go, etc. It is opensource ❤️ Also, if you haven't watched any of his talks about latency or GC, please watch them. The below code takes response times in microseconds and can measure the percentiles. Credits: https://github.com/HdrHistogram/hdrhistogram-go Do not re-invent the wheel: In the real world, you will be deploying your service on multiple servers across the globe for scalability and availability, so calculating these percentiles by ourselves is not practical. There are a lot of technologies which can help us, and those are Prometheus https://prometheus.io/docs/practices/histograms/ Instana https://www.instana.com/blog/how-to-measure-latency-properly-in-7-minutes/ Elasticsearch https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html Conclusion: By this time, you would have understood that latency is an important metric for a company and it should be given equal weightage with your use-case testing. 
References: https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/ "How NOT to Measure Latency" by Gil Tene https://github.com/HdrHistogram/hdrhistogram-go https://medium.com/@djsmith42/how-to-metric-edafaf959fc7 https://blog.bramp.net/post/2018/01/16/measuring-percentile-latency/ https://www.elastic.co/blog/averages-can-dangerous-use-percentile https://www.manageengine.com/network-monitoring/faq/95th-percentile-calculation.html https://www.section.io/blog/preventing-long-tail-latency/ https://perspectives.mvdirona.com/2009/10/the-cost-of-latency/

Pinpoint anywhere on earth for better navigation

Have you ever had trouble explaining your address to a delivery agent? Do you still depend on landmarks for navigation? These three words will save you from this, and they are not "Oh My God" 🤣🤣

Problem: Street addresses and GPS are not accurate enough to specify precise locations such as building entrances and parking exits. GPS combined with cellular tower data has an RMSE (Root Mean Square Error) of about 10 meters, and this number varies with satellite positioning, the hardware of the GPS receiver, whether you are in a rural area, etc.

Why is GPS alone not enough? GPS (Global Positioning System) depends on strong signals from satellites positioned around the globe. Every smartphone has a built-in receiver for these signals and also uses cellular tower signals to pinpoint your location accurately while you are in motion. Many GPS devices ideally need to receive signals from at least 7 or 8 satellites to calculate a location to within about 10 meters. With fewer satellites, the uncertainty and inaccuracy increase. With fewer than 4 satellites, many GPS receivers struggle to produce accurate location estimates and will report "GPS signal lost" at points during the route. Imagine pinpointing a location in a rural area, where there is a high chance of reaching fewer satellites.
Source: https://hellotracks.com/en/blog/How-to-Improve-your-GPS-Accuracy/

What three words?
It divides the entire earth into 3 m x 3 m squares.
It gives each square a unique, randomised string of 3 words separated by dots, such as "moons.inflict.rental", which points to Shibuya, Tokyo.
You share these three words to correctly identify any location on Earth.
Credits: https://what3words.com/about

How to use it?
1. Download the app "what3words"
Android: https://play.google.com/store/apps/details?id=com.what3words.android&hl=en_IN&gl=US
iOS: https://apps.apple.com/gb/app/what3words/id657878530
2. Give the app location access permission, tap the GPS icon or satellite mode, find your grid square, and tap on it to see your three words.
Source: https://what3words.com
3. You can use these three words to navigate with any map app, like Google Maps, Apple Maps, etc.
Source: https://what3words.com

Interesting applications: Here are some of the interesting use cases built on top of what three words.
Postal Services: Mongolia changed its traditional postal code system to what three words to identify a particular address. https://qz.com/705273/mongolia-is-changing-all-its-addresses-to-three-word-phrases/
Emergency Situations: Victims can share their three words with helpers in emergency situations, like being stuck in the Amazon forest. This app has saved many lives. https://what3words.com/what3words-for-emergencies-real-life-stories/
Ride-Hailing: Ride-hailing apps like Careem, Cabify, Addison Lee, Easy Taxi, etc. have started supporting "what three words" for better navigation.
Logistics & Delivery: "What three words" is a boon to logistics and delivery because delivery agents can easily navigate to customer addresses without calling customers multiple times to ask for landmarks.
........
Refer to this document for more use cases: https://what3words.com/products/

Conclusion: "What three words" is an interesting idea for solving today's navigation problems, so why wait? Download the app today 🥳

References:
https://what3words.com
https://hellotracks.com/en/blog/How-to-Improve-your-GPS-Accuracy/
https://qz.com/705273/mongolia-is-changing-all-its-addresses-to-three-word-phrases/
https://www.bbc.com/news/uk-england-49319760

Beware of slices in Golang

Developers often confuse slices with arrays, perhaps because their syntax is so similar. Slices are more powerful than traditional arrays, but with great power comes great responsibility. Before jumping into today's main topic, let's spend some time understanding the significant differences between arrays and slices.

What are slices and arrays?

Arrays: Arrays are typed collections with a fixed size.
var arr [N]T // N is the size and T is the element type

Eg: arr := [3]int32{1, 2, 3}
Arrays are fixed-length, hence the size cannot be changed. Arrays are value types, not reference types, in Go. This means that when an array is assigned to a new variable, the new variable gets a copy of the original array.
arr1 := [3]int32{1, 2, 3}
arr2 := arr1
arr2[2] = 22

fmt.Println(arr2[2]) // Prints 22
fmt.Println(arr1[2]) // Prints 3
Slices: A slice doesn't own any data; it is just a reference to an underlying array. A slice holds a pointer to an existing array, so modifying a value through the slice may change that array (there is a catch here, which is explained a bit later).
arr1 := [3]int32{1,2,3}
slice := arr1[:]
slice[1] = 11

fmt.Println(arr1[1]) //Prints 11
fmt.Println(slice[1]) //Prints 11
Slices are dynamic, meaning new elements can be easily added using the append function. You can think of a Slice as a struct ( SliceHeader )
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
Here Data is a pointer to the first element (index 0) of the backing array, because a slice doesn't hold any data itself.

All fine! So what's the problem?

1. Garbage collection

As you know, a slice holds a reference to an existing array. As long as the slice is in use, that array cannot be garbage collected. Suppose we have an array containing 10000 elements or objects, and we only need a small part of it for processing, say 11 elements.
arr := [10000]int32{1, 2, 3, .....}
slice := arr[:11] // holds the reference to the existing array
The important thing to note here is that the existing array won't be garbage collected, because the slice still references it.

Looks like a problem, so what's the solution? One way to solve this is to copy the elements you need into a new slice.
Output:
arr address: 0xc000114000
[1 2] &{Data:824634851328 Len:2 Cap:5}
[1 2] &{Data:824634867792 Len:2 Cap:2}
If you look closely at the output:
Line 1: if you convert 0xc000114000 to decimal you get 824634851328, which is the Data pointer of tempSlice (see line 2). This shows that tempSlice refers to arr.
Line 3: the Data pointer of slice is 824634867792, which is different, so it no longer depends on arr or tempSlice. Once nothing else references them, arr and tempSlice can be garbage collected in a later GC cycle.

2. Capacity planning

Unlike arrays, slices give you the flexibility to add new elements dynamically.
Note: getSliceHeader is a helper method that lets us view the slice header.
Output:
before appending '0': &{Data:0 Len:0 Cap:0}
After appending '0': &{Data:824633811136 Len:1 Cap:1}
before appending '1': &{Data:824633811136 Len:1 Cap:1}
After appending '1': &{Data:824633811248 Len:2 Cap:2}
before appending '2': &{Data:824633811248 Len:2 Cap:2}
After appending '2': &{Data:824633819424 Len:3 Cap:4}
before appending '3': &{Data:824633819424 Len:3 Cap:4}
After appending '3': &{Data:824633819424 Len:4 Cap:4}
before appending '4': &{Data:824633819424 Len:4 Cap:4}
After appending '4': &{Data:824633843968 Len:5 Cap:8}
Final Result is: [0 1 2 3 4]
You may have these questions when you see the output:
Why does Data (the pointer) change when the capacity is exhausted?
Why does the capacity double when Len reaches Cap?
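The listing that produced this output is not reproduced here. As a stand-in, the following minimal sketch shows the same effect using only len and cap rather than the post's getSliceHeader helper; countReallocations is a hypothetical name, and a change in capacity is used as the signal that append moved to a new backing array.

```go
package main

import "fmt"

// countReallocations appends the ints 0..n-1 to s and counts how many of
// those appends had to move to a larger backing array. A reallocation is
// detected by a change in capacity.
func countReallocations(s []int, n int) ([]int, int) {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return s, reallocs
}

func main() {
	// nil slice: len 0, cap 0, so append has to grow it as it goes.
	grown, n1 := countReallocations(nil, 5)

	// Capacity planned ahead: room for 10 elements from the start.
	planned, n2 := countReallocations(make([]int, 0, 10), 5)

	fmt.Println(grown, "reallocations:", n1)
	fmt.Println(planned, "reallocations:", n2) // 0: capacity 10 was never exceeded
}
```

The exact growth steps of the unplanned slice depend on the Go runtime, which is why the sketch only counts them instead of asserting specific capacities.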
Those are good questions, so let's talk about the reasons for this behavior:
A nil slice starts off with zero capacity (check line 1 of the output).
When length and capacity are equal, appending a new item grows the capacity; for small slices like this one it doubles (check lines 4 and 6).
When the capacity grows, the pointer to the backing array (i.e. the Data field of the reflect.SliceHeader struct) also changes, because append allocates a larger array and copies the existing elements into it.

Wait, what's the problem with this? If you don't plan the capacity ahead, you get a new backing array every time Len reaches Cap, and each of those appends pays for an allocation plus a copy of all the existing elements. Let's see some code to understand this better.
Output:
before appending '0': &{Data:824633851984 Len:0 Cap:10}
After appending '0': &{Data:824633851984 Len:1 Cap:10}
before appending '1': &{Data:824633851984 Len:1 Cap:10}
After appending '1': &{Data:824633851984 Len:2 Cap:10}
before appending '2': &{Data:824633851984 Len:2 Cap:10}
After appending '2': &{Data:824633851984 Len:3 Cap:10}
before appending '3': &{Data:824633851984 Len:3 Cap:10}
After appending '3': &{Data:824633851984 Len:4 Cap:10}
before appending '4': &{Data:824633851984 Len:4 Cap:10}
After appending '4': &{Data:824633851984 Len:5 Cap:10}
[0 1 2 3 4]
Data (the pointer) always points to 824633851984 because we initialized the backing array with a capacity of 10, so every append runs in O(1) time without reallocating.

3. Be careful with append

You might be thinking that slices are dynamic and append provides a way to add new elements to the end. Yes, that's correct! But if you are not careful while appending, you may end up with bugs that are very hard to track down. Let's take an example.
You might guess the output would be:
[0 1 2 3] [0 1 2 4]
But the output is:
[0 1 2 4] [0 1 2 4]
But how? As explained in the previous sections, the slice header has Len and Cap, and the capacity doubles when Len reaches Cap, remember?
Let's use the getSliceHeader function to instrument the code.
Output:
x is initialized = [] &{Data:0 Len:0 Cap:0}
After appending 0, x = [0] &{Data:824634433544 Len:1 Cap:1}
After appending 1, x = [0 1] &{Data:824634433616 Len:2 Cap:2}
After appending 2, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 3, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 3, y = [0 1 2 3] &{Data:824634523712 Len:4 Cap:4}
After appending 4, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 4, z = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}
After appending 4, y = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}
Let me clear up the confusion around append using the above output.
x is initialized = [] &{Data:0 Len:0 Cap:0}: an empty slice has been created.
.........
After appending 2, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}: appending 2 doubled the capacity to 4 because Len had reached Cap (2). Len is now 3, so there is one free slot left before Cap is reached.
After appending 3, y = [0 1 2 3] &{Data:824634523712 Len:4 Cap:4}: appending 3 to x used that free slot, so the y slice header has Len 4 and Cap 4. The x slice header, however, still has Len 3 and Cap 4, which means x still sees one free slot.
After appending 4, z = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}: appending 4 to x wrote into the same slot, so the z slice header has Len 4 and Cap 4 and shares its backing array with y. The x slice header still has Len 3 and Cap 4.
Hope it's clear now: x still has Len 3, so every append to x writes to the same slot, and that is the reason for the overwrite.

How to fix this?
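The listing for this example is not reproduced here, so the following is a minimal sketch of the overwrite described above and of a copy-based way around it. The function names are hypothetical, and make([]int, 0, 4) is an assumption that stands in for the slice that had grown to Len 3 and Cap 4 in the output, so the result does not depend on how the runtime grows slices.

```go
package main

import "fmt"

// aliased reproduces the surprise: y and z are both built by appending to x,
// and x still has one free slot, so both appends write to the same element.
func aliased() (y, z []int) {
	x := make([]int, 0, 4) // assumption: stands in for the grown slice
	x = append(x, 0, 1, 2) // len 3, cap 4: one free slot left
	y = append(x, 3)       // fits in the free slot, shares x's backing array
	z = append(x, 4)       // same slot again: overwrites the 3 that y sees
	return y, z
}

// copied gives y and z their own backing arrays before appending.
func copied() (y, z []int) {
	x := make([]int, 0, 4)
	x = append(x, 0, 1, 2)

	y = make([]int, len(x), len(x)+1) // destination needs a non-zero length
	copy(y, x)
	y = append(y, 3)

	z = make([]int, len(x), len(x)+1)
	copy(z, x)
	z = append(z, 4)
	return y, z
}

func main() {
	y, z := aliased()
	fmt.Println(y, z) // [0 1 2 4] [0 1 2 4]

	y, z = copied()
	fmt.Println(y, z) // [0 1 2 3] [0 1 2 4]
}
```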
One way to fix this is by using copy.
Output:
[0] &{Data:824634433544 Len:1 Cap:1}
[0 1] &{Data:824634433616 Len:2 Cap:2}
[0 1 2] &{Data:824634523712 Len:3 Cap:4}
[0 1 2 4] &{Data:824634425440 Len:4 Cap:6}
[0 1 2 3] &{Data:824634425488 Len:4 Cap:6}
FYI: copy only copies as many elements as the shorter of the two slices holds, so it copies nothing into an empty or nil destination slice. You should initialize the destination slice with some length first; check Line No 17, 18 for more info.

4. Append on a sliced slice

Sometimes appending to a sliced slice can modify the original slice.
Output:
a: [1 2 3 4 5], sliceHeader: &{Data:824634425392 Len:5 Cap:5}
b: [3 4], sliceHeader: &{Data:824634425408 Len:2 Cap:3}
a: [1 2 3 4 20], sliceHeader: &{Data:824634425392 Len:5 Cap:5}
b: [3 4 20], sliceHeader: &{Data:824634425408 Len:3 Cap:3}
You might be wondering why a has changed. To understand the issue, let's look at how the slice expression [2:4] works internally. When you slice a slice with an expression like [low:high], the sliced slice gets a new capacity: Cap = capacity of the original slice - low. Here b gets a capacity of 3 (5 - 2), and this space is shared between the original slice and the sliced slice. That is why appending to b changed a.
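The behavior above can be reproduced with the short sketch below. The function names are hypothetical. The second function uses Go's full slice expression a[low:high:max] to cap the capacity of b; this is a different technique from the fix the post goes on to describe, shown only as one more option.

```go
package main

import "fmt"

// sharedAppend shows the problem: b has spare capacity that belongs to a,
// so append writes into a's backing array.
func sharedAppend() (a, b []int) {
	a = []int{1, 2, 3, 4, 5}
	b = a[2:4]        // len 2, cap 3 (cap(a) - 2)
	b = append(b, 20) // fits in the spare slot, which is a[4]
	return a, b
}

// isolatedAppend limits b's capacity with a full slice expression,
// so append has no spare room and must allocate a new backing array.
func isolatedAppend() (a, b []int) {
	a = []int{1, 2, 3, 4, 5}
	b = a[2:4:4]      // len 2, cap 2 (max - low)
	b = append(b, 20) // no room left: values are copied to a new array
	return a, b
}

func main() {
	a, b := sharedAppend()
	fmt.Println(a, b) // [1 2 3 4 20] [3 4 20]

	a, b = isolatedAppend()
	fmt.Println(a, b) // [1 2 3 4 5] [3 4 20]
}
```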
One way to fix this is to make sure b has no spare capacity (Len equal to Cap) before appending.
Output:
a: [1 2 3 4 5], sliceHeader: &{Data:824635383808 Len:5 Cap:5}
b: [3 4 5], sliceHeader: &{Data:824635383824 Len:3 Cap:3}
a: [1 2 3 4 5], sliceHeader: &{Data:824635383808 Len:5 Cap:5}
b: [3 4 5 20], sliceHeader: &{Data:824635383904 Len:4 Cap:6}
In the above example, the capacity and length of slice b were both 3, so calling append on b triggered the grow logic: the values were copied to a new array with double the capacity (6). That is why the original slice a was not affected by the change. Another way to fix the issue is to use copy instead of append ;)

Conclusion: Slices in Go are very powerful and memory-efficient, but unlike arrays they are not straightforward. Developers need to be extra careful when using slices, or they end up wasting a lot of time tracking down bugs ;)

References:
https://www.sohamkamani.com/golang/arrays-vs-slices/
https://blog.golang.org/slices-intro
https://stackoverflow.com/questions/44152988/append-not-thread-safe
https://www.tugberkugurlu.com/archive/working-with-slices-in-go-golang-understanding-how-append-copy-and-slicing-syntax-work

Split/Slice an Array into chunks (golang)

The code for this post splits an array into multiple chunks based on the chunk size. Why can't I use math.Min? Because Go's math.Min works on float64 rather than int, and at the time of writing Go had no generics, which was one of its disadvantages. Generics were then expected in Go 2.0 (https://github.com/golang/go/issues/25597); they have since shipped in Go 1.18, and Go 1.21 added built-in min and max functions.
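The original listing is not reproduced here. As a stand-in, here is a minimal sketch of chunking a slice of ints; chunk is a hypothetical name, and the upper bound of each chunk is clamped with a plain integer comparison instead of the float64-based math.Min.

```go
package main

import "fmt"

// chunk splits items into consecutive sub-slices of at most size elements.
// The last chunk may be shorter. A non-positive size yields no chunks.
func chunk(items []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) { // integer "min" without math.Min
			end = len(items)
		}
		chunks = append(chunks, items[start:end])
	}
	return chunks
}

func main() {
	fmt.Println(chunk([]int{1, 2, 3, 4, 5, 6, 7}, 3)) // [[1 2 3] [4 5 6] [7]]
}
```

Each chunk is a sub-slice that shares the backing array of items, so writing to a chunk element also changes items; copy the chunk first if that is not what you want.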
