DEV Community: Alexey Yuzhakov

Modern Books for Software Engineering Managers

Alexey Yuzhakov — Wed, 22 Jan 2025 10:26:32 +0000

The main goal of this article is to share the list of books that are worth reading and practical for the Software Engineering Manager role or similar, where the mix of technical skills and people management comes into play.

There are many interesting books, but I would like to emphasize those published over the last few years. Our industry evolves very quickly. Classic books may still be good and valuable, but considering time limits, modern technologies, and approaches, it’s better to start with something as practical as possible to get up to speed.

Last but not least, I saw so many threads on Twitter where various folks promoted “curated lists of books” without reading a page of any of them. Yes, I read all the mentioned books and even more 🙂 So, I suggest the books I read from cover to cover and tried to apply pieces of advice in practice.

TL;DR

The list of books published recently that I found worth reading:

Become an Effective Software Engineering Manager
Debugging Teams: Better Productivity through Collaboration
The Manager's Path: A Guide for Tech Leaders Navigating Growth and Change
Managing Humans: Biting and Humorous Tales of a Software Engineering Manager
Leading Effective Engineering Teams
Radical Candor
No Rules Rules: Netflix and the Culture of Reinvention
Engineering Management for the Rest of Us
An Elegant Puzzle: Systems of Engineering Management
The Missing README: A Guide for the New Software Engineer

Become an Effective Software Engineering Manager

My most loved book is “Become an Effective Software Engineering Manager.” It competes with “The Manager’s Path” in the category of “comprehensive guide.” The book is rather big but not bloated, with an excellent focus on practical advice and the author’s hands-on experience on the path to becoming an engineering manager. Best practices are explained as “tools” you must use and “rules” to follow. The book is written in a “first-person shooter” manner: you start at the beginning of this career path and “attack” the problems you observe. It is easy to read, and it is interesting to see what will happen next. I would not say it is focused on effectiveness, as stated in the title, but using all the presented “tools” should level you up as a manager.

Debugging Teams

Another excellent book, “Debugging Teams,” was written by two Google managers. It’s just ~200 pages long and has many satiric illustrations that make reading easy and fun. I found this book very practical and useful. Despite the fact that it was written by people working in a big tech company, it’s suitable for managers of 3-5 member teams as well. This book is worth reading if you are concerned about the team's productivity and how to improve it. The main theme of the book is an interconnection of three traits: humility (don’t put yourself in the first place), respect (to the team, organization, users), and trust (to the same things). The authors did a great job by providing a lot of examples to demonstrate the influence of all these aspects on the productivity of the team.

The Manager's Path

This book is one of the most popular recommendations. I suggest choosing “The Manager’s Path” if you have time to read only one book. The book covers a wide range of topics and sheds light on the manager’s path from tech lead to CTO. You can use it as a handbook and look for the answers to particular questions and use step-by-step instructions. So, the book is very practical. It also has a lot of “Ask CTO” snippets with answers to controversial questions like “I still want to write code” or “hiring interns.” Every chapter ends with “assessing your own experience” exercises. If you provide fair answers, it should definitely help you understand your areas for improvement.

Managing Humans

If you have worked as an engineering manager for several years, you will enjoy reading “Managing Humans.” These “biting and humorous” stories are so good. The author worked at famous places like Borland, Netscape, Apple, Palantir, Pinterest, and Slack and faced many different situations. But these are not only “funny stories.” This is knowledge sharing by a very experienced manager. If you have worked as an engineering manager for a while, there is a good chance that some of these stories will be very familiar to you. Do some of your engineers hate you? The story “Wallace Hates Me” could help you find a clue and decide what to do next. Are you struggling to find the time “to think”? There are insights on how to distinguish the real thinking process from just reacting. Running meetings is always a challenge. There is a classification of “meeting creatures” like “Laptop Larry,” “Mr. Irrelevant,” “Chatty Patty,” and many others. There is a high chance you have already met them in your meetings. The book starts with a story titled “Don’t be a Prick” and holds the attention till the last story. So, in general, this book is an interesting reflection on how things work in engineering management.

Leading Effective Engineering Teams

“Leading Effective Engineering Teams” is one more book from a Googler. This time, the word “effective” in the title plays a central role in the book. If you have read about “Project Oxygen” and “Project Aristotle” before, you will find a lot of repetition here. If you haven’t read about them, the book's author provides a better and more detailed explanation of those projects’ key findings than other publicly available articles. Plus, he shares wisdom from his decade-long Google experience on building truly effective teams. The book is full of “bullet-style” instructions and suggestions that could be helpful. However, I am personally not a big fan of that writing style, as these never-ending lists quickly become hard to read and follow. Moreover, I got the impression that many things in the book, like “ask good questions” and “use the right tools,” look rather idealistic. Equations like “hire super talented people, build a highly effective team, and create a valuable product” are easier to proclaim than to solve in real life. But it’s still worth reading, at least if you want to know what an ideal engineering manager’s life should look like.

Radical Candor

I tried starting to read this book three times and finished it only on the third attempt. Eventually, I realized that if I pushed through the first 30-40 pages, I could get used to the writing style and keep reading. So, if you are struggling like me, try to use this piece of advice.
It’s very hard to build an effective team of talented engineers. But it’s even harder to retain them. The author of the book introduces the “Radical Candor” philosophy and describes its practical application at Apple and Google. How can you be a great boss without losing your humanity? This is a tough question, and you can find not only a theoretical background but also practical advice on how to answer it. I wouldn’t say this is a book for recently promoted managers; it is for people who have worked as managers for a while and would like to level up their management skills.

No Rules Rules

Although the word “radical” was in the title of the previous book, “No Rules Rules” is the most radical book I have read about engineering management. The book describes how processes are organized at Netflix. Most probably, you will be unable to apply all these practices in your company. Maybe even none of them. But you will definitely be impressed by how some ordinary work can be organized in very different ways. For example, if you want to spend $60 on some cloud service and you have to prepare a written justification, ask three different people for approval, and wait for two weeks, it’s not a surprise why your work goes very slowly, and there is no room for innovations. On the other hand, giving people the freedom to spend the company’s money as they wish could sound uncomfortable, at least.
One of the often cited parts of this book is a “Keeper Test.” If some of your engineers wanted to leave, would you fight for them? If the answer is no… Well, just read the book to get the answer. Even if you are not ready for radical steps, doing the “Keeper Test” is an excellent exercise to be prepared for tough times.
“No Rules Rules” is about rules, but absolutely different ones. After reading this book, you may feel these rules are too insane to be applied in the real world. However, Netflix is a successful company and an absolutely real one.
This book should be interesting, especially for top managers and company owners who struggle with a lack of innovations and a slow pace of work.

Engineering Management for the Rest of Us

“Engineering Management for the Rest of Us” is an interesting book, focusing mainly on the fact that most engineering managers had no prior education in management and struggled with crises and many unfamiliar problems after becoming managers.
I had the strange impression that the chapters were a compilation of presentations. I know she is a presenter, but I have never seen her presentations. The book is quite short, with many topics covered very briefly. Despite that fact, the author tries to provide a lot of useful examples. If you are looking for a short introduction to the topic of engineering management, this is the right book for you.

An Elegant Puzzle

First of all, “An Elegant Puzzle” should be interesting for a manager of managers. The author worked at several famous companies, like Uber and Stripe, and shares his experience in this book. If you struggle with questions about the size of the engineering team or how to select project leads, you will find practical suggestions with detailed explanations. I found that the chapters are quite independent of each other, and there is no strong storyline, most probably because these chapters were initially blog posts. However, there is a benefit to such an approach as well. You can use this book as a handbook and read only about the topic you are interested in to get practical advice and step-by-step instructions. Software engineering in general, and engineering management in particular, are often puzzling. Books like this help a lot in becoming good at solving such puzzles.

The Missing README

At first glance, this book does not look like it is for managers but for new software engineers. Why do I suggest it? The reason is simple: sooner or later, every manager faces onboarding a new member to the project. Every project has its specifics. But there are a lot of common things as well. Instead of reinventing the wheel by writing your own instructions for newcomers, you can utilize the wisdom from this book. With great clarity, the authors describe almost every aspect software engineers will meet during their work and provide ready-to-use instructions. During my work, I found a lot of practical examples from this book that helped me save time during discussions with engineers.

Final Thoughts

Do not fool yourself: reading about something is not the same as doing that. But knowledge can be very powerful. If you are facing a challenging situation and are already armed with knowledge, you’re increasing your odds of finding the right solution and making a more informed decision.

I’ve been working as an engineering manager for a long time, and I wish I had read all these books before I started. It would have saved me a lot of time and nerves. Learning only from your own mistakes is very expensive and sensitive in management. You can fix a software bug you made earlier, but it’s much more complicated to reverse the decision you made to hire or fire someone.

I would be very glad if you found some of these books interesting and read them. If you would like to share your favorite book about software engineering management, feel free to do so in the comments.

JIRA Analytics with Pandas

Alexey Yuzhakov — Fri, 23 Aug 2024 11:04:39 +0000

Problem

It's hard to dispute that Atlassian JIRA is one of the most popular issue trackers and project management solutions. You can love it, you can hate it, but if you were hired as a software engineer for some company, there is a high probability of meeting JIRA.

If the project you are working on is very active, there can be thousands of JIRA issues of various types. If you are leading a team of engineers, you may be interested in analytical tools that can help you understand what is going on in the project based on data stored in JIRA. JIRA has some reporting facilities integrated, as well as 3rd party plugins. But most of them are pretty basic. For example, it's hard to find reasonably flexible "forecasting" tools.

The bigger the project, the less satisfied you are with integrated reporting tools. At some point, you will end up using an API to extract, manipulate, and visualize the data. During the last 15 years of JIRA usage, I saw dozens of such scripts and services in various programming languages around this domain.

Many day-to-day tasks may require one-time data analysis, so writing services every time doesn't pay off. You can treat JIRA as a data source and use a typical data analytics tool belt. For example, you may take Jupyter, fetch the list of recent bugs in the project, prepare a list of "features" (attributes valuable for analysis), utilize pandas to calculate the statistics, and try to forecast trends using scikit-learn. In this article, I would like to explain how to do it.

Preparation

JIRA API Access

Here, we will talk about the cloud version of JIRA. But if you are using a self-hosted version, the main concepts are almost the same.

First of all, we need to create a secret key to access JIRA via REST API. To do so, go to profile management: https://id.atlassian.com/manage-profile/profile-and-visibility. If you select the "Security" tab, you will find the "Create and manage API tokens" link:

Create a new API token here and store it securely. We will use this token later.

Jupyter Notebooks

One of the most convenient ways to play with datasets is to utilize Jupyter. If you are not familiar with this tool, do not worry. I will show how to use it to solve our problem. For local experiments, I like to use DataSpell by JetBrains, but there are services available online and for free. One of the most well-known services among data scientists is Kaggle. However, their notebooks don't allow you to make external connections to access JIRA via API. Another very popular service is Colab by Google. It allows you to make remote connections and install additional Python modules.

JIRA has a pretty easy-to-use REST API. You can make API calls using your favorite way of doing HTTP requests and parse the response manually. However, we will utilize an excellent and very popular jira module for that purpose.

Tools in Action

Data Analysis

Let's combine all the parts to come up with the solution.

Go to the Google Colab interface and create a new notebook. After the notebook creation, we need to store previously obtained JIRA credentials as "secrets." Click the "Key" icon in the left toolbar to open the appropriate dialog and add two "secrets" with the following names: JIRA_USER (your Atlassian account email) and JIRA_PASSWORD (the API token created earlier). At the bottom of the screen, you can see how to access these "secrets":

The next thing is to install an additional Python module for JIRA integration. We can do it by executing the shell command in the scope of the notebook cell:



!pip install jira

The output should look something like the following:



Collecting jira
  Downloading jira-3.8.0-py3-none-any.whl (77 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.5/77.5 kB 1.3 MB/s eta 0:00:00
Requirement already satisfied: defusedxml in /usr/local/lib/python3.10/dist-packages (from jira) (0.7.1)
...
Installing collected packages: requests-toolbelt, jira
Successfully installed jira-3.8.0 requests-toolbelt-1.0.0

We need to fetch the "secrets"/credentials:



from google.colab import userdata

JIRA_URL = 'https://******.atlassian.net'
JIRA_USER = userdata.get('JIRA_USER')
JIRA_PASSWORD = userdata.get('JIRA_PASSWORD')

And validate the connection to the JIRA Cloud:



from jira import JIRA

jira = JIRA(JIRA_URL, basic_auth=(JIRA_USER, JIRA_PASSWORD))
projects = jira.projects()
projects

If the connection is ok and the credentials are valid, you should see a non-empty list of your projects:



[<JIRA Project: key='PROJ1', name='Name here..', id='10234'>,
 <JIRA Project: key='PROJ2', name='Friendly name..', id='10020'>,
 <JIRA Project: key='PROJ3', name='One more project', id='10045'>,
...

So we can connect and fetch data from JIRA. The next step is to fetch some data for analysis with pandas. Let’s try to fetch the list of solved problems during the last several weeks for some project:



JIRA_FILTER = 19762

issues = jira.search_issues(
    f'filter={JIRA_FILTER}',
    maxResults=False,
    fields='summary,issuetype,assignee,reporter,aggregatetimespent',
)

We need to transform the dataset into the pandas data frame:



import pandas as pd

df = pd.DataFrame([{
    'key': issue.key,
    'assignee': issue.fields.assignee and issue.fields.assignee.displayName or issue.fields.reporter.displayName,
    'time': issue.fields.aggregatetimespent,
    'summary': issue.fields.summary,
} for issue in issues])

df.set_index('key', inplace=True)

df

The output may look like the following:

We would like to analyze how much time it usually takes to solve the issue. People are not ideal, so sometimes they forget to log the work. It brings a headache if you try to analyze such data using JIRA built-in tools. But it's not a problem for us to make some adjustments using pandas. For example, we can transform the "time" field from seconds into hours and replace the absent values with the median value (beware, dropna can be more suitable if there are a lot of gaps):



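# Assign the result back: calling fillna(inplace=True) on a selected column is unreliable in recent pandas versions
df['time'] = df['time'].fillna(df['time'].median())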
df['time'] = df['time'] / 3600
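To make the difference between the two options concrete, here is a small sketch on invented data (the ticket keys and durations are made up). It shows how filling gaps with the median keeps every ticket, while dropna shrinks the dataset:

```python
import pandas as pd

# Invented sample: time logged in seconds, None where nobody logged the work.
df = pd.DataFrame(
    {'time': [3600, None, 7200, None, 10800]},
    index=['T-1', 'T-2', 'T-3', 'T-4', 'T-5'],
)

# Option 1: keep all tickets, replace gaps with the median, convert to hours.
filled = df['time'].fillna(df['time'].median()) / 3600

# Option 2: drop tickets without logged time, convert to hours.
dropped = df['time'].dropna() / 3600

print(filled.tolist())   # [1.0, 2.0, 2.0, 2.0, 3.0]
print(dropped.tolist())  # [1.0, 2.0, 3.0]
```

With many gaps, the filled series becomes dominated by the median value, which is why dropna can be the safer choice in that case.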

We can easily visualize the distribution to find out anomalies:



df['time'].plot.bar(xlabel='', xticks=[])

It is also interesting to see the distribution of solved problems by the assignee:



top_solvers = df.groupby('assignee').count()[['time']]
top_solvers.rename(columns={'time': 'tickets'}, inplace=True)
top_solvers.sort_values('tickets', ascending=False, inplace=True)

top_solvers.plot.barh().invert_yaxis()

It may look like the following:

Predictions

Let's try to predict the amount of time required to finish all open issues. Of course, we can do it without machine learning by using a simple approximation based on the typical time to resolve an issue: the predicted time is the number of open issues multiplied by the average (or median) time to resolve one. For example, if the median time to solve one issue is 2 hours and we have 9 open issues, the time required to solve them all is approximately 18 hours. It's a good enough forecast, but we know the speed of solving depends on the product, team, and other attributes of the issue. If we want to improve the prediction, we can utilize machine learning to solve this task.
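The rough estimate can be written down in a few lines. This is only a sketch with invented durations; the variable names are not part of the original notebook:

```python
from statistics import mean, median

# Invented resolution times (in hours) of recently closed issues.
resolved_hours = [1.0, 2.0, 2.0, 3.0, 8.0]
open_issue_count = 9

typical_hours = median(resolved_hours)       # 2.0
estimate_hours = open_issue_count * typical_hours

print(estimate_hours)         # 18.0
print(mean(resolved_hours))   # 3.2, pulled up by the single 8-hour outlier
```

The median is less sensitive to a few unusually long tickets than the mean, so the two variants of the estimate can differ noticeably.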

The high-level approach looks like the following:

Obtain the dataset for “learning”
Clean up the data
Prepare the "features" aka "feature engineering"
Train the model
Use the model to predict some value of the target dataset

For the first step, we will use a dataset of tickets for the last 30 weeks. Some parts here are simplified for illustrative purposes. In real life, the amount of data for learning should be big enough to make a useful model (e.g., in our case, we need thousands of issues to be analyzed).



issues = jira.search_issues(
    f'project = PPS AND status IN (Resolved) AND created >= -30w',
    maxResults=False,
    fields='summary,issuetype,customfield_10718,customfield_10674,aggregatetimespent',
)

closed_tickets = pd.DataFrame([{
    'key': issue.key,
    'team': issue.fields.customfield_10718,
    'product': issue.fields.customfield_10674,
    'time': issue.fields.aggregatetimespent,
} for issue in issues])

closed_tickets.set_index('key', inplace=True)
closed_tickets['time'] = closed_tickets['time'].fillna(closed_tickets['time'].median())

closed_tickets

In my case, it's something around 800 tickets and only two fields for "learning": "team" and "product."

The next step is to obtain our target dataset. Why do I do it so early? I want to clean up and do "feature engineering" in one shot for both datasets. Otherwise, the mismatch between the structures can cause problems.



issues = jira.search_issues(
    f'project = PPS AND status IN (Open, Reopened)',
    maxResults=False,
    fields='summary,issuetype,customfield_10718,customfield_10674',
)

open_tickets = pd.DataFrame([{
    'key': issue.key,
    'team': issue.fields.customfield_10718,
    'product': issue.fields.customfield_10674,
} for issue in issues])

open_tickets.set_index('key', inplace=True)

open_tickets

Please notice we have no "time" column here because we want to predict it. Let's fill it with zeros as a placeholder and combine both datasets to prepare the "features."



open_tickets['time'] = 0
tickets = pd.concat([closed_tickets, open_tickets])

tickets

Columns "team" and "product" contain string values. One of the ways of dealing with that is to transform each value into separate fields with boolean flags.



products = pd.get_dummies(tickets['product'], prefix='product')
tickets = pd.concat([tickets, products], axis=1)
tickets.drop('product', axis=1, inplace=True)

teams = pd.get_dummies(tickets['team'], prefix='team')
tickets = pd.concat([tickets, teams], axis=1)
tickets.drop('team', axis=1, inplace=True)

tickets

The result may look like the following:
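As an illustration on invented data (the product and team names are made up), the sketch below shows the shape of the encoded result and why encoding the combined dataset matters: encoded separately, the two frames end up with different columns, and a model trained on one cannot be applied to the other. It uses the columns argument of pd.get_dummies, which encodes several columns in one call.

```python
import pandas as pd

# Invented tickets: 'mobile' appears only among open tickets, 'api' only among closed ones.
closed = pd.DataFrame({'product': ['web', 'api'], 'team': ['red', 'blue']},
                      index=['T-1', 'T-2'])
open_ = pd.DataFrame({'product': ['web', 'mobile'], 'team': ['red', 'red']},
                     index=['T-3', 'T-4'])

# Encoding each frame on its own produces mismatched feature columns.
separate_closed = pd.get_dummies(closed, columns=['product', 'team'])
separate_open = pd.get_dummies(open_, columns=['product', 'team'])
print(sorted(separate_closed.columns))
print(sorted(separate_open.columns))

# Encoding the combined frame gives both parts the same set of columns.
combined = pd.get_dummies(pd.concat([closed, open_]), columns=['product', 'team'])
print(combined)
```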

After the combined dataset preparation, we can split it back into two parts:



closed_tickets = tickets[:len(closed_tickets)]
open_tickets = tickets[len(closed_tickets):].copy()  # explicit copy, so assigning predictions later does not touch a view

Now it's time to train our model:



from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

features = closed_tickets.drop(['time'], axis=1)
labels = closed_tickets['time']

features_train, features_val, labels_train, labels_val = train_test_split(features, labels, test_size=0.2)

model = DecisionTreeRegressor()
model.fit(features_train, labels_train)
model.score(features_val, labels_val)
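For a regressor, score returns the coefficient of determination (R²), which is not always easy to interpret. A complementary check is the mean absolute error, expressed in the same units as the target. The following is a sketch on synthetic data, not the author's JIRA dataset: the team names, the "team_b takes twice as long" rule, and the noise level are all invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic tickets: one categorical feature and a noisy target in hours.
teams = rng.choice(['team_a', 'team_b'], size=200)
hours = np.where(teams == 'team_b', 4.0, 2.0) + rng.normal(0, 0.5, size=200)

features = pd.get_dummies(pd.Series(teams), prefix='team')
X_train, X_val, y_train, y_val = train_test_split(
    features, hours, test_size=0.2, random_state=0)

model = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

r2 = model.score(X_val, y_val)                          # R^2 on held-out data
mae = mean_absolute_error(y_val, model.predict(X_val))  # average error in hours
print(f'R^2 = {r2:.2f}, MAE = {mae:.2f} hours')
```

Fixing random_state makes the split and the tree reproducible, which helps when comparing feature sets between runs.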

And the final step is to use our model to make a prediction:



open_tickets['time'] = model.predict(open_tickets.drop('time', axis=1, errors='ignore'))
open_tickets['time'].sum() / 3600

The final output, in my case, is 25 hours, which is higher than our initial rough estimation. This was a basic example. However, by using ML tools, you can significantly expand your abilities to analyze JIRA data.

Conclusion

Sometimes, JIRA built-in tools and plugins are not sufficient for effective analysis. Moreover, many 3rd party plugins are rather expensive, costing thousands of dollars per year, and you will still struggle to make them work the way you want. However, you can easily utilize well-known data analysis tools by fetching necessary information via JIRA API and go beyond these limitations. I spent so many hours playing with various JIRA plugins in attempts to create good reports for projects, but they often missed some important parts. Building a tool or a full-featured service on top of JIRA API also often looks like overkill. That's why typical data analysis and ML tools like Jupyter, pandas, matplotlib, scikit-learn, and others may work better here.

Build Binary Tree from Array

Alexey Yuzhakov — Wed, 31 Jan 2024 17:06:57 +0000

Intro

If you are interested in algorithms, data structures, and building efficient solutions or just preparing for the coding interview, you are aware of LeetCode and similar websites. Here, I will talk about a data structure called Binary Tree and the ways to build it using the array representation. LeetCode has dozens of such problems to practice with this data structure.

Problem

One of the ways to store the tree is to use an array representation. It’s hard for humans to analyze it, but it can be compact and convenient for machine processing. LeetCode is not an exception, so the typical problem describes the input as an array, converts it to the tree representation, and expects you to provide a solution. But if you want to experiment with the algorithm in your favorite IDE, like PyCharm, instead of the built-in LeetCode editor, there is a problem. The conversion of the array to the tree representation is hidden, and you need to implement it on your own.

The tree node is represented as follows:



class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

An example of the input is below:



[1, 2, 3, null, 4, null, null, 5, 6, null, 7]

Visual representation of such a tree looks like the following:

We need to create a function that accepts an array as a parameter and returns the root node of the created tree:



def build_tree(arr: list[int | None]) -> TreeNode | None:
    ...  # TODO: implement

Approach #1

At first glance, it looks like the classic approach with 2i + 1 for the left child index and 2i + 2 for the right one should work well. Here is a visual representation of how to find the indexes for children nodes during the array traversal.

The recursive implementation may look like the following:



def build_tree(arr: list[int], i: int, n: int) -> TreeNode | None:
    root = None
    if i < n and arr[i] is not None:
        root = TreeNode(arr[i])
        root.left = build_tree(arr, 2 * i + 1, n)
        root.right = build_tree(arr, 2 * i + 2, n)
    return root

An example of the call:



arr = [1, 2, 3, None, 4, None, None, 5, 6, None, 7]
root = build_tree(arr, 0, len(arr))

I use None to make Python syntax valid and represent absent nodes. We need to notice our tree is not a “complete binary tree” or “binary heap” where all levels, except possibly the last, are completely filled, and nodes on the last level go from left to right. So it’s ok to have missing nodes. And here is a gotcha! LeetCode uses a specific array representation. Take a look at the visualization once again:

The node with value 4 has two children: 5 and 6. Let’s enumerate our array with indexes:



[0: 1, 1: 2, 2: 3, 3: None, 4: 4, 5: None, 6: None, 7: 5, 8: 6, 9: None, 10: 7]

The node with value 4 has index 4. Our formula tells us that its left child is at index 2*4+1=9 and its right child at 2*4+2=10. But the array holds None and 7 at those positions instead of 5 and 6. The correct array representation to comply with the formula should look like the following:



[1, 2, 3, None, 4, None, None, None, None, 5, 6, None, None, None, None, None, None, None, None, None, 7, None, ...]

As you can see, there are a lot of None values, and the total number of elements is much higher. We are talking about the binary tree, and the tree in the example has 5 levels. So, the number of items to store all the possible values is 1+2+2*2+2*2*2+... = 31. Meanwhile, our tree has 7 values and uses only 11 array items to store them.
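The arithmetic behind that number is a geometric series: a tree with h levels needs 2^h - 1 slots in the index-formula layout. A tiny sketch (the function name is made up for this illustration):

```python
def heap_array_size(levels: int) -> int:
    """Slots needed to store every possible node of a binary tree with this many levels."""
    return 2 ** levels - 1


compact = [1, 2, 3, None, 4, None, None, 5, 6, None, 7]

print(heap_array_size(5))               # 31
print(sum(2 ** k for k in range(5)))    # 31, the same sum written level by level
print(len(compact))                     # 11 items in the compact representation
```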

To validate the correctness of the tree, we can implement a simple string representation of the class (nesting represents the parent-child relationship):



class TreeNode:
    # ...

    def __repr__(self):
        def traverse(root: TreeNode, level: int) -> str:
            if not root:
                return ''
            prefix = '  ' * level
            return f'{prefix}({root.val})\n' + traverse(root.left, level + 1) + traverse(root.right, level + 1)
        return traverse(self, 0).rstrip()

Let’s print out our tree:

The created tree is incorrect. LeetCode uses a more compact representation, which determines child node offsets in a different way.
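The failure is easy to reproduce as data. Here is a minimal self-contained sketch that rebuilds the tree with the 2i + 1 / 2i + 2 formula and lists the values in preorder; the helper names build_tree_heap and preorder are made up for this illustration:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def build_tree_heap(arr, i=0):
    """Build a tree assuming heap-style indexes: children of i live at 2i+1 and 2i+2."""
    if i >= len(arr) or arr[i] is None:
        return None
    return TreeNode(
        arr[i],
        build_tree_heap(arr, 2 * i + 1),
        build_tree_heap(arr, 2 * i + 2),
    )


def preorder(node):
    """Return node values in root, left, right order."""
    if node is None:
        return []
    return [node.val] + preorder(node.left) + preorder(node.right)


arr = [1, 2, 3, None, 4, None, None, 5, 6, None, 7]
print(preorder(build_tree_heap(arr)))  # [1, 2, 4, 7, 3]: nodes 5 and 6 are lost
```

Node 7 also ends up attached to node 4 instead of node 5, because index 10 is read as the right child of index 4.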

Approach #2

To deal with missing values and the compact representation, we can utilize an iterative approach with queues this time:



def build_tree(arr: list[int]) -> TreeNode | None:
    if len(arr) == 0:
        return None

    nodes = []

    val = arr.pop(0)
    root = TreeNode(val)
    nodes.append(root)

    while len(arr) > 0:
        curr = nodes.pop(0)

        left_val = arr.pop(0)
        if left_val is not None:
            curr.left = TreeNode(left_val)
            nodes.append(curr.left)

        if len(arr) > 0:
            right_val = arr.pop(0)
            if right_val is not None:
                curr.right = TreeNode(right_val)
                nodes.append(curr.right)

    return root

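I use the list type for illustrative purposes. Because pop(0) on a Python list takes O(n) time and we call it a lot, the lists can be replaced with a proper queue implementation such as collections.deque. Note that this function also consumes the input array, so pass a copy if you need the original afterward.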
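One possible queue-based variant is sketched below. It is not the author's code: the names build_tree_deque and preorder are made up, the input is read through an iterator so the array is left untouched, and collections.deque provides O(1) removal from the front.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def build_tree_deque(arr):
    """Build a tree from the compact level-order array without mutating it."""
    if not arr or arr[0] is None:
        return None

    values = iter(arr)
    root = TreeNode(next(values))
    pending = deque([root])  # nodes still waiting for their children

    for left_val in values:
        if not pending:
            break  # malformed input: values left but no parent to attach them to
        curr = pending.popleft()
        if left_val is not None:
            curr.left = TreeNode(left_val)
            pending.append(curr.left)
        right_val = next(values, None)  # the array may end after a left child
        if right_val is not None:
            curr.right = TreeNode(right_val)
            pending.append(curr.right)

    return root


def preorder(node):
    """Return node values in root, left, right order."""
    if node is None:
        return []
    return [node.val] + preorder(node.left) + preorder(node.right)


arr = [1, 2, 3, None, 4, None, None, 5, 6, None, 7]
print(preorder(build_tree_deque(arr)))  # [1, 2, 4, 5, 7, 6, 3]
print(len(arr))                         # 11, the input is unchanged
```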

Let’s validate the correctness of the new method:

Everything is correct. The proposed solution works well on the LeetCode’s way of representing the trees as arrays.

Conclusion

The array representation usually stores "binary heaps" or "complete binary trees." But as you can see, some tweaks help to store any binary tree as an array in a rather compact form. The rebuilding of the tree from such an array also becomes trickier, and you need to be careful with indexes for missing values.

The complete source code can be found here.

SSH as VPN Alternative

Alexey Yuzhakov — Sat, 04 Nov 2023 13:57:36 +0000

Internet Openness

Over the last decades, the principle of Internet openness has often been ignored and violated. Suppose you travel a lot and want to access resources located in one region while you are physically in another. In that case, it is no longer a surprise to find the resource inaccessible. The reasons vary, but one of the most common is "we suffered from attacks from region X, so we decided to block access for all the people/IPs from region X," or even worse, "we decided to allow access only for people of our region based on IP."

VPN

I think VPN services became quite popular not only due to security reasons but also as a way to solve the described problem: provide access to a resource regardless of client IP-based limitations. There are a lot of VPN service providers across the globe. Surprisingly, the usage of VPN services can be less secure than it seems at first glance. Okay, you can buy a droplet in DigitalOcean and probably install OpenVPN or WireGuard. But at least it takes time for the initial configuration. If the need for such access is quite infrequent, all these efforts are not worth the time investment.

SSH Tunnel

There is some chance that you, like me, already have a virtual or physical server with SSH in the region to which you want access. For example, sitting in Sofia, Bulgaria, I want to check some websites hosted in Germany. Meanwhile, I have a DigitalOcean droplet located in Frankfurt, Germany, with SSH access. The SSH client is already in place on my machine. So, the only thing I need to do is establish the SSH tunnel and use a properly configured web browser for accessing these German websites.

The following command establishes the tunnel on port 12345:



ssh -D 12345 my-droplet-in-frankfurt.com

The only difference from a typical SSH command is the “-D” flag, which instructs the SSH client to listen on local port 12345 as a SOCKS proxy and forward the traffic from our local machine through the remote server. So, we will access the desired websites "on behalf" of the remote machine.
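Before pointing a browser at the tunnel, it can be handy to confirm that something is actually listening on the local port. Here is a minimal sketch using only the Python standard library; the function name is made up, and it only checks that the port accepts TCP connections, not that a SOCKS proxy is behind it.

```python
import socket


def is_tunnel_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    # 12345 matches the port passed to "ssh -D" above.
    print(is_tunnel_listening(12345))
```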

My primary browser is Google Chrome. For alternative web browsing through the SSH tunnel, I'm using Mozilla Firefox. To set up a proxy, go to Settings -> Network Settings and fill in the SOCKS proxy fields (host 127.0.0.1 and port 12345 for the tunnel above), as highlighted in the screenshot below:

SSH tunnel looks like a typical SSH session. So you can quit it as soon as you finish your web browsing of restricted websites. You also don't need to change your Firefox configuration every time you need to access different websites. Just establish the SSH tunnel to the new location, open Firefox, and start browsing.

Conclusion

An SSH tunnel is an often overlooked alternative to full-featured VPN services. For occasional use by a single person, the SSH tunnel can be a simpler and more convenient way of accessing restricted websites.

Build an Open Source Project: Behind the Scenes

Alexey Yuzhakov — Sun, 02 Jul 2023 18:11:04 +0000

Starting Point

One of my favorite hobbies is working on Open Source projects. Usually, it starts with solving my own problem. At some point in time, I can assume other people may find my experiments useful, and it's time to open-source the project. I see a lot of people think it’s enough to push the code to GitHub to open-source their project. Technically, yes, it's an essential step. Is it enough? Definitely, not.

If you decide to open-source your project, the first preparation steps are described in the guide at https://opensource.guide/starting-a-project/. If you haven't read it, I highly recommend doing so. It’s still not enough, but it is a good starting point.

Some time ago, I started a project called "xq", which is a command-line XML and HTML beautifier and content extractor written in Go. Using this project as an example, I want to show what I did to make it a little bit more discoverable and usable by other people.

Motivation

Well, you have already pushed the source code of your project to GitHub. Why keep reading? Very soon, you may realize that nobody is using the cool stuff you spent your nights building last year. It can be disappointing.

The community around my projects motivates me to continue working on them and improving them, bringing some interesting ideas and even code contributions. It doesn’t matter if it's an Open Source project or a commercial one. It is always inspiring to see your tool used by hundreds, thousands, or millions of people. The more, the better. Meanwhile, since I always start such projects to solve my own problems, I benefit a lot from the growing amount of feedback.

Another thing that motivates me a lot is the ability to experiment with technologies and try new things. I have a regular job and work on a commercial product. It's always hard to bring in something new and unproven because commercial products are not playgrounds. So my open source initiatives have always helped me gain knowledge of technologies I can't use or experiment with at my main job.

Target Audience

It's quite important to analyze the potential target audience of your project in detail for the project's success. I often see how creators and maintainers expect the same level of skills from their project users. In most cases, it’s a misunderstanding that becomes a problem for both parties.

For example, for my "xq" utility which deals with XML in CLI and is written in Go, I expect the users to know what XML is for and how to use command-line utilities. But I don’t expect the knowledge of Go, the corresponding toolchain, or even any coding skills at all.

Usability

The next important thing is to think about your project as a product that is at least easy to install and start using. Ideally, the value of the product is absolutely clear for the end-user from the first seconds of your product usage.

We like to push our Open Source project to GitHub, but it's code-centric. We should help our users with clear and easy installation instructions. And here is a trap. I use Mac and want my "xq" utility installable using Homebrew:

brew install xq

I checked the guidelines on how to contribute to Homebrew and found that if your repo has zero "stars", you have zero chances to be added. One needs to find alternatives to start with. Ok, Go provides a facility for that:

go install github.com/sibprogrammer/xq@latest

At first glance, it looks good, but it limits the audience to people who already have the Go toolchain installed. It's quite a strict limitation. So in my case, a good starting point was the bash installer for the CLI utility:

curl -sSL https://bit.ly/install-xq | sudo bash
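The actual script behind the short link is not shown here, so the following is only a sketch of the platform detection such an installer typically needs before it can download the right binary. The function names and the archive naming scheme (xq_<os>_<arch>.tar.gz) are assumptions made for illustration.

```shell
# Map `uname -s` output to the OS name a release archive might use.
normalize_os() {
  case "$1" in
    Linux)  echo linux ;;
    Darwin) echo darwin ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Map `uname -m` output to a Go-style architecture name.
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the (hypothetical) archive name for a given OS and architecture.
archive_name() {
  archive_os=$(normalize_os "$1") || return 1
  archive_arch=$(normalize_arch "$2") || return 1
  echo "xq_${archive_os}_${archive_arch}.tar.gz"
}

# Usage: print the archive name for the current machine.
#   archive_name "$(uname -s)" "$(uname -m)"
```

Failing early with a clear message on an unsupported platform is friendlier than downloading a binary that will not run.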

After gaining the first 50-70 "stars", we will be ready for new endeavors like brew or apt install.

But even before thinking about how to make the installation process easy, we need to attract a potential user and show the value of our product. I really like projects that have a screenshot or animated video demonstrating the main features right at the beginning of the README.md file. If we are talking about CLI utilities, a comprehensive set of usage examples is also an essential part. If it’s some kind of web product, a link to a preinstalled demo server is a must. Life is short, and we all are very busy. You probably have no sales team to convince potential users, and people may evaluate your product for no more than 1-2 minutes.

Marketing

Well, for the majority of us, the word "marketing" is not associated with anything good, interesting, or something we want to do. But it's not enough to have a good product; we need to tell other people about it somehow.

It is not so hard to start with friends and local communities. Twitter, Reddit, Facebook, LinkedIn, and other social networks may help you to gain the first feedback and attract the first users. I want to tell you the story about "xq" marketing efforts to show how it works in real life.

The company I work for has an initiative called "Research Days," and there is a special event named "Research Days Demos." So, the first presentation I did was at this event for my colleagues. There were not so many people, but some of them liked the utility. I also made a short post in the internal Slack channel related to similar initiatives.

The next two attempts were posts about the utility on Reddit. One was successful (in the sense that there was a discussion and the project attracted a few more users), and another one was blocked by the moderator (and I still have no clue why). Eventually, I got enough "stars" to be able to prepare a pull request to join Homebrew.

Later there was a period of slow organic growth, bug fixing, and implementing new features. I researched ways to make installation on Linux simpler, and the assistance of distro maintainers helped a lot with adoption.

At some point in time, I decided to try the power of Hacker News, and it was so impressive. Tons of feedback and feature requests alongside the increasing number of "stars." I'm not a very active Twitter user and have a few followers, but after the post to Hacker News, I found there are a lot of bots on Twitter trying to repost every piece of news. There were even a couple of discussions after such reposts.

There are some more ideas, like joining "awesome" lists or publishing a dedicated article with a detailed feature set description and use cases. In general, I think it's a good idea, after several months of development, to spend some effort on marketing to improve the product's discoverability.

Save Your Time

Working on a project can easily become boring due to various chores. Sorting out badly written bug reports, manually testing every change, and so on may become a main source of dissatisfaction, especially because you are working on an Open Source project, spending your own time, and doing it for free. One of the ways to keep yourself motivated is to focus on the question of how to avoid chores and save time.

GitHub allows providing custom templates for new issue creation. Do not underestimate them. They significantly help to streamline the reports and prompt people to answer the questions the maintainer needs answered.
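As a rough illustration, a bug report template can be scaffolded with a few lines of shell. This is a sketch, not the template used by any particular project: the questions and the "bug" label are assumptions, while the location (.github/ISSUE_TEMPLATE) and the front matter keys are the ones GitHub reads.

```shell
# Create a bug report template in the directory GitHub looks for.
mkdir -p .github/ISSUE_TEMPLATE

# The quoted 'EOF' keeps the shell from expanding anything inside the template.
cat > .github/ISSUE_TEMPLATE/bug_report.md <<'EOF'
---
name: Bug report
about: Report incorrect or unexpected behavior
labels: bug
---

**Version** (output of the tool's version command):

**Operating system and installation method:**

**Command you ran and the input** (smallest sample that reproduces the problem):

**Expected output:**

**Actual output:**
EOF
```

Once the file is committed to the default branch, the template is offered when someone opens a new issue.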

The last thing I want to deal with in the scope of my Open Source projects is the regression bugs. That's why a comprehensive set of tests is a significant time saver and brings the joy of working on the project improvements. Without tests, the maintenance of more or less complex projects after 5-6 release cycles will easily become a nightmare. It's interesting to see how often this truth is ignored.

Writing a set of tests is not enough. GitHub Actions is an excellent facility for organizing the CI process not only for pushes to the master branch but for pull requests as well. Otherwise, it is quite disappointing to find out after merging someone's pull request that the code style or even the tests are broken. It's not so hard to set up the actions. If you check your favorite Open Source projects, I guess all of them will have CI established.
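For a Go project, such a workflow could look roughly like the one generated below. It is a minimal sketch and not the actual "xq" configuration: the file name, the job name, and the choice of "go vet" as the style check are assumptions.

```shell
# Create a CI workflow that runs on pushes to master and on every pull request.
mkdir -p .github/workflows

# The quoted 'EOF' writes the YAML as-is, without shell expansion.
cat > .github/workflows/ci.yml <<'EOF'
name: CI

on:
  push:
    branches: [master]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: stable
      # Static checks first, then the test suite.
      - run: go vet ./...
      - run: go test ./...
EOF
```

The "pull_request" trigger is the important part: contributors see failures before the merge instead of after it.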

With "xq", I went even further and automated the release process using GoReleaser. To publish a new release, the only thing I need is to create and push the Git tag. The corresponding GitHub Action will trigger a release process, and GoReleaser prepares the binaries and changelog based on declared conventions. The result has a high level of predictability, and no manual work is required.
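From the maintainer's side, the whole release then boils down to something like the helper below. It is a sketch under assumptions: the function name is made up, the remote is assumed to be called "origin", and the "v1.2.3" tag format is an example of a convention, not necessarily the one "xq" uses.

```shell
# Hypothetical helper: tag the current commit and push only that tag.
# The pushed tag is what would trigger the release workflow.
release_tag() {
  version="$1"
  # Accept only tags that look like v<major>.<minor>.<patch>.
  case "$version" in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *) echo "expected a tag like v1.2.3, got: $version" >&2; return 1 ;;
  esac
  git tag -a "$version" -m "Release $version" && git push origin "$version"
}

# Usage:
#   release_tag v1.2.3
```

Rejecting malformed tags before anything is pushed avoids triggering a release by accident.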

Perks

Open Source projects are usually not about making money, at least not in a direct way. If you want to create a product and are not sure how to monetize it, starting by open-sourcing it may not be a good idea at all, from my point of view.

But there are a lot of other exciting benefits. One of them I already mentioned briefly in the "Motivation" section: learning by doing and working with cutting-edge technologies can be easier in the scope of Open Source projects. Many times after such experiments, I re-used the knowledge in the commercial products as well.

If we are talking about a career, I believe almost every developer with 10+ years of experience should be able to show some code he or she wrote. NDA is not an excuse. Almost all modern software depends heavily on Open Source components. So it should be just a matter of time and experience to contribute a fix or improvement to some project. I could be completely wrong, but as someone who has conducted hundreds of tech interviews during the last two decades, I found it much easier to build the candidate profile if I could check the code he or she wrote. So while working on Open Source projects, you are definitely investing in building your public profile as a developer.

Usually, you can't build a product without using various tools. Some of them can be free, and some of them can be commercial. The great benefit of working on Open Source projects is that a lot of companies with commercial products have special offers for non-commercial development. In the case of the "xq" utility, which is written in Go, I use GoLand IDE by JetBrains. I paid for it for several months but later applied to their Open Source Program. They provided me with a license not only for GoLand but for the whole product pack! Another example is the CodeCov service. I want to track code coverage in an easy way to control the quality and ensure all major scenarios are covered by tests. CodeCov is quite expensive for commercial products (especially the hosted version, which costs ~50K USD), but it's absolutely free for Open Source projects, and this is awesome! If you need project hosting (e.g., for demo purposes), you may try to apply to the DigitalOcean Open Source initiative, and there are other alternatives available. These are just a few examples.

Connecting the Dots

To summarize the article, here are some takeaways I hope you may find helpful.

If you want to build an Open Source project, think about it in terms of the product, not the project, even if we are talking about the library. It should be easy to install and start using. The value for the end-user should be obvious from the first minutes of usage.

Without marketing, people have almost no chance of learning about your cool project. It is surprisingly not so hard to start. And it's connected with your long-term motivation in more ways than you might initially think.

Your Open Source project should not become a tedious job. One of the ways to achieve that is to automate all routine work and enforce the policies. In the end, the release should be a joy, and reading user feature requests and bug reports should not make you cry due to the lack of essential details.

Last but not least, working on Open Source projects is rewarding. Sometimes it's not so obvious, but if we are talking about the long-term perspective, it definitely is.