An introduction to SysOps and DevOps

The role of a SysOps engineer and the kind of work they perform can vary from company to company. For the context of this piece, we will use the title of SysOps Engineer to describe the role of a person that is hired by a company to develop and maintain the infrastructure that is required for software products to run in a production environment.

A person in a SysOps role is responsible for a system and for keeping all of its various elements running smoothly. This can include anything from a single server to multiple servers, databases, and any third-party dependencies that the system relies upon to operate.

There are many variations of the title: System Administrator, SysAdmin, Ops, SysOps, Operations, and more recently Site Reliability Engineer (SRE), a title created at Google.

Whichever title you decide to choose in your career, the aim of this content is to give you an introduction to the terms and practices of a SysOps engineer as well as some practical experience with several tools and technologies to help you start your SysOps career.

What is DevOps?

DevOps means cross-department cooperation between developers and SysOps engineers. It is a cultural approach within a company intended to drive efficiency and reduce problems when developing and deploying software to production.

The term DevOps describes an approach to work in which developers write code with the complete system in mind as much as possible, or work closely with the SysOps team during each phase of ongoing development.

By working together, software can be developed quickly and safely.

Before DevOps, software development often followed a Waterfall method. This approach broke development up into logical stages such as gathering requirements, planning, designing, developing, testing (including security), deployment, and maintenance.

This process was rigid and slow. The stages were linear, and it wasn’t possible to move on to the next stage until the previous one was completed. Changes to the plan once it was in progress were either blocked or slowed the project down. It wasn’t always possible to know all of the requirements at the beginning of the process, and there was no going back, which resulted in failed or dropped projects.

Agile methods became popular as a way to change this approach. The aim is to create a minimum working software product quickly to satisfy customers’ needs, and then to iterate in short development cycles to add new features or fix bugs.

Without a DevOps approach, developers and SysOps engineers stick to their own particular areas of work, or silos, which can result in a poorly developed product or service.

Teams that work in silos may end up moving in different directions or at different speeds without considering each other’s work.

Some examples of working in silos:

A developer might write code without thinking about things that can happen in a production environment. Events such as DNS issues or a database failover don’t happen when developing locally, but they can happen in production, and a developer’s code may not be ready to handle them gracefully.

A development team might deploy some great new features into production for its users. These new updates might include new database queries being performed, putting additional pressure on a busy database. This could impact negatively on performance for users and could have been avoided if communicated in advance with the team responsible for looking after the database infrastructure.

A SysOps team working in isolation without communicating with development teams may work to improve a system and make changes to the infrastructure in ways that the developers aren’t expecting. A simple example is upgrading a database to the latest security-patched version while existing apps are expecting the previous database version. This is the kind of scenario that can arise when teams don’t work together towards a common goal.

Returning to the earlier example of new database queries, this can put the database under pressure, with long process lists and rising CPU or memory usage, and begin setting off alarms for the SysOps team to react to.

Your customers might be impacted by this. They may have just gained access to new features in your software, and suddenly it has stopped working or performance has dropped significantly.

The solution might be an easy one: scale horizontally by increasing the number of available database replicas, or upgrade the database itself to handle the work. Those options are all reactive, though, and you’re likely to be applying them at peak traffic, which isn’t an ideal time to discover the problem.

A development team and a SysOps team working closely can prevent these kinds of issues from occurring by being aware of each other’s work, communicating often, and keeping the customers in mind.
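
As a rough illustration of handling production events "in a graceful manner", the sketch below retries an operation with an increasing delay when it hits a temporary failure. The names TransientError, call_with_retries, and make_flaky_query are made up for this example, and the flaky query only simulates a database failover. Real code would catch the specific exceptions raised by its own database driver or HTTP client.

```python
import time


class TransientError(Exception):
    """Stands in for a temporary failure such as a DNS error or a database failover."""


def call_with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on TransientError with exponential backoff.

    Returns the operation's result, or re-raises the last error once
    all attempts have been used.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise
            # Wait base_delay, then double it after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_query(failures):
    """Hypothetical query that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("database is failing over")
        return "42 rows"

    return query
```

Calling call_with_retries(make_flaky_query(2)) fails twice, waits between attempts, and then returns the result. The sleep parameter is only there so the waiting can be swapped out when testing.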

Becoming a SysOps Engineer

SysOps engineers often start their careers as software developers and grow into a SysOps role out of necessity.

A modern website or web application is more than just software. When you’re working alone, or as part of a small team, you may be the only person that can develop the software as well as any items it needs, such as file storage, the ability to send emails, or a database to store user data.

Creating a database or managing space for users to upload files may be outside of your comfort zone as a developer, though these items are necessary for your software to work well and be useful for its users.

The aim of this content is to guide you on how to create and manage items like this. You may feel overwhelmed in the beginning, or even along the way, as there is a lot to learn. Try to take one thing at a time. If something is failing, try to become a mini expert in it: search for information on it, take a backup, take notes, and try to fix it one change at a time.

Some tasks you will need to perform as a SysOps engineer can be basic, such as freeing up disk space on a server that is running low. Try to automate tasks like this, or better still, prevent the problem from happening.
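
As one possible starting point for automating a disk space check, the sketch below uses Python's standard shutil.disk_usage to report how full a filesystem is. The function names and the 80% threshold are assumptions chosen for illustration, not recommendations.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.used / usage.total * 100


def check_disk(path="/", threshold=80.0):
    """Return a warning string if usage is at or above threshold, else None."""
    percent = disk_usage_percent(path)
    if percent >= threshold:
        return f"WARNING: {path} is {percent:.1f}% full"
    return None
```

A script like this could be run on a schedule and its warning sent to your team, so that you hear about a filling disk before it becomes an outage.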

A developer might turn to Google, blog posts or books on how to add these items to their software as the need arises.

The server you are working on could be a Linux-based or a Windows-based machine, so over time you’ll need to become familiar with some basic commands on each system.
In later chapters, we’ll learn about some AWS-specific ways to solve common problems such as creating servers, load balancing, databases, and backups.

Things you’ll need to know as a SysOps engineer:

  • Being a developer – While it isn’t essential to be a developer before moving to a SysOps position, any development experience you already have will certainly help. As a result of being a developer, you may already be familiar with many of the tools and concepts such as Git, testing, compiling, deploying changes, and more, which we will explore in more detail later on this site. If you don’t have a developer background, a good place to start would be a side project of your own, where you learn a new programming language and build a small application from start to finish.
  • Command line – Command line usage is common in a SysOps role. You’ll need to be familiar with the different command lines offered by different systems. Linux-based systems often use a shell called Bash, while Windows systems use the DOS-style Command Prompt or PowerShell. Using the command line can be a real time saver compared to performing actions manually in your file manager, for example. Command line actions can also be shared with coworkers or added to scripts to automate several tasks.
  • SSH Public and Private keys – SSH keys consist of two files: a public key file and a private key file that together make up a pair. SSH keys are a great way to connect to remote servers to deploy your code. You always keep your private key to yourself, while the public key can be shared with a remote server so that it can authenticate you. Once you have your keys in place, connecting to a remote server is as easy as running ssh username@remotehost. Many services such as Git and deployment tools use keys, which we’ll cover in more detail later.
  • Databases – There are many database options available, from file-based SQLite to multi-master relational databases such as MySQL and NoSQL databases such as MongoDB.
    • You don’t need to know every kind of database. The database(s) you use will likely be dictated by the software you or your organization are developing. If you are already familiar with one kind of database such as MySQL, it is worth taking the time to look into other databases to learn their strengths and weaknesses, and to learn how to add/remove users, set user permissions, and perform day-to-day actions such as creating/altering tables/collections.
    • A brief introduction to different types of common databases:
      • Relational databases such as MySQL or Microsoft SQL Server are ideal for structured data such as user details, online shopping product details, blog posts, and most other common application uses.
      • NoSQL databases such as MongoDB or Amazon DynamoDB are ideal for unstructured or semi-structured data such as JSON documents. NoSQL databases can be ideal if you are taking in third-party data that you don’t control or data that may change its structure over time without warning.
      • In-memory databases such as Redis are fast key/value databases. They can help improve the performance of an app by keeping frequently accessed data readily available rather than your software connecting to a potentially busy database as often.
      • Graph databases such as Neo4j are ideal for storing data where relationships matter, such as social network relationships or complex data networks.
      • Time series databases (TSDBs) are ideal for storing data where the time sequence is important, such as metrics or application logs.
      • Multi-model databases are becoming more common. Database offerings such as ArangoDB or OrientDB can support two or more different data models in one, which can be ideal if the app you are developing needs more than one type of data storage.
  • Version Control – Using a version control system such as Git is a way to store the source code of your software. Changes are committed to a repository and other members of the team can access and modify the same files where permitted.
    • Even if you are the only developer on a project, using a version control system is still a good idea, as it lets you get familiar with branching models for when you work in a team later on. Some useful terminology when using Git:
      • A commit records a set of changes in the local copy of your repository.
      • A push is the process of pushing one or more commits from your local machine to the central repository – so your coworkers can see the work for example.
      • A pull is the process of pulling any changes from the main repository to your local machine.
      • A branch is a way to take a copy of the source code where you can work on changes or bug fixes. When you are ready to share your work after one or more commits, you can push the branch to the central repository. The branch becomes visible to team members to contribute to or comment on, while the original codebase remains unchanged.
      • A PR is a Pull Request. This is a process where you can request that your branch be merged into the main code so that your work becomes part of the main application.
      • A fork is similar to a branch. However, a branch often only exists for a short time before being merged or dropped, whereas a fork will usually mean that a developer is taking a full copy of the source code for their own use, to make significant additional changes that they may never want to merge back into the main source code.
    • Popular version control services include GitHub, GitLab, and Bitbucket. These services are an ideal starting point for deploying software changes to production, which we will cover in more detail later.
  • Web servers – If you are developing a web application, you’ll want to put it online somewhere so that other people can use it. You have a number of options for running your application. You can self-host it on a physical server you own, or use a cloud-based virtual server. Or you can package your app to run within a container or use a serverless approach. Later in the series, we’ll give an intro to each approach with practical examples so that you can choose an approach that works for you and the software you develop.
    • Using any approach will give you a great experience in developing, deploying, and managing your web application.
  • Deployments – A deployment is the process of taking completed code and sending it to one or more destination servers for users to use. There are many different types of deployments, from single file changes to more involved blue/green deployments. In a blue/green deployment, new servers are created on demand, user traffic is diverted to the new servers when they are ready, and the old servers are removed afterwards.
    • The specific deployment process you use will depend on the software you are developing and the deployment destination, such as virtual servers, containers or serverless functions. The following is a list of common deployment tools:
      • FTP/SFTP – This is the process of sending files, often one at a time, to a destination server. FTP stands for File Transfer Protocol and commonly uses port 21 to communicate with the destination server. SFTP is the SSH File Transfer Protocol, which runs over SSH and usually communicates with port 22 on the destination server. FTP is an older, unencrypted protocol and is generally not recommended, as newer, more secure approaches with better features are available.
      • Rsync – Rsync is short for remote sync. It is a tool for syncing files from one location to another, such as from your machine to a server. It is useful because it only sends the files that have changed to the destination server. It can also maintain any preexisting file permissions, compress files, and show progress as it syncs files.
      • SCP – cp, short for copy, is a common command line utility for copying files and folders locally. SCP is Secure Copy, which can copy files to a remote server over SSH.
      • Git Pull – This is the process of ‘pulling’ files inward, from a remote Git repository on to the current server. A local process on the server will need to trigger the pull step. This approach isn’t generally recommended, as it requires a copy of Git to be installed on the server, along with a .git folder, which can expose sensitive information if the server is compromised.
      • AWS API/CLI – The AWS CLI is a command line interface to many AWS services. The AWS API or CLI can be used to send files to Amazon S3, a popular choice for static frontend websites or applications that talk to an independently deployed backend API.
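
To make the "only send what has changed" idea from the Rsync entry above more concrete, here is a small local sketch that copies files from one folder to another only when they are new or different. It is a simplification and not how Rsync itself works: this version compares SHA-256 digests of whole files between two local folders, and the function names are made up for the example.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def sync_changed(source, destination):
    """Copy files from source to destination only if they are new or changed.

    Returns the sorted list of relative paths that were copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dest_file = destination / relative
        if dest_file.is_file() and file_digest(dest_file) == file_digest(src_file):
            continue  # unchanged, nothing to send
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        # copy2 also tries to preserve metadata such as timestamps.
        shutil.copy2(src_file, dest_file)
        copied.append(relative.as_posix())
    return sorted(copied)
```

Running sync_changed twice in a row on the same folders copies everything the first time and nothing the second time, which is the behaviour that makes this style of deployment quick for small changes.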

Exploring other forms of Ops

DevSecOps

In the same way that DevOps aims to bring Development and Operations closer together to improve the development process, DevSecOps aims to make security an integral part of all stages of the development life cycle.

Instead of leaving security measures to the end of development, or omitting them entirely, security is kept in mind at each stage of development, from writing code, to compiling binaries, to deployment and ongoing maintenance.

While writing code, developers can add security-related tests, such as SQL injection attempts or CSRF attempts, into their unit tests.
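
A minimal sketch of such a test is shown below, using Python's built-in sqlite3 and unittest modules. The users table and the find_user function are hypothetical and exist only for this example. The test checks that a classic injection string is treated as plain data rather than as part of the SQL statement.

```python
import sqlite3
import unittest


def find_user(conn, username):
    """Look up a user by name using a parameterized query."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


class FindUserSecurityTest(unittest.TestCase):
    def setUp(self):
        # An in-memory database keeps the test fast and isolated.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)"
        )
        self.conn.executemany(
            "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
        )

    def tearDown(self):
        self.conn.close()

    def test_normal_lookup(self):
        self.assertEqual(find_user(self.conn, "alice"), [(1, "alice")])

    def test_injection_attempt_returns_nothing(self):
        # If the input were concatenated into the SQL, this would match every row.
        self.assertEqual(find_user(self.conn, "' OR '1'='1"), [])
```

If someone later rewrites find_user to build the query with string concatenation, the second test starts failing, which is the early warning a DevSecOps approach is after.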

While building a server, an Ops team can choose to harden their base server or container images using Center for Internet Security (CIS) benchmarks or other standards. Hardening a base image in this way to remove common vulnerabilities can be automated using tools such as Ansible or Chef, which we cover later in this series.

When deployed to production, the Ops team could continue to scan and monitor the applications.

AWS provides a number of security-related services, including Amazon GuardDuty, a service that continuously monitors your AWS environment for malicious activity.

GitOps

GitOps is a concept introduced by a company called Weaveworks. GitOps is similar to the concept of Infrastructure as Code (IaC). With IaC, your infrastructure is created and managed using software such as Terraform or CloudFormation. This code can be treated just like any other application code and added to a source code repository on a service such as GitHub. The same workflows often employed in software development, such as branches, PRs, and merges, can be applied to IaC, resulting in the desired changes being deployed to update your infrastructure in place.

Whereas IaC is aimed at the entire infrastructure, GitOps tends to be aimed at container orchestration tools such as Kubernetes. In a Kubernetes cluster, software can be deployed and managed using a range of tools, starting with kubectl.

Deployments can be performed independently of one another, and it isn’t always clear at a high level which applications have been deployed at a given time and which versions of those applications are currently running within the cluster.

With a GitOps approach, information on the deployments that should be running within the cluster is written in a declarative format such as a YAML file and committed to the repository. When new software needs to be deployed to the cluster, the files in the repository are updated, which triggers a background process to bring the cluster up to date to match the desired outcome.

If additional applications are deployed to the cluster independently of the GitOps approach, such as by deploying directly using kubectl, then the cluster will automatically be reverted to the state described within the repository.
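
The "bring the cluster up to date" step can be pictured as comparing two lists: what the repository says should be running, and what is actually running. The sketch below is a toy model of that comparison using plain dictionaries of application name to version. It is not the API of any real GitOps tool, and the function names and action labels are assumptions made for illustration.

```python
def plan_reconciliation(desired, actual):
    """Compare desired state (from the repository) with the actual cluster state.

    Both arguments map application name -> version. Returns a sorted list of
    (action, app, version) tuples needed to make actual match desired.
    """
    actions = []
    for app, version in desired.items():
        if app not in actual:
            actions.append(("deploy", app, version))
        elif actual[app] != version:
            actions.append(("update", app, version))
    for app, version in actual.items():
        if app not in desired:
            # Deployed outside the repository workflow, so it is removed.
            actions.append(("remove", app, version))
    return sorted(actions)


def apply_plan(actual, actions):
    """Return the new cluster state after applying the planned actions."""
    new_state = dict(actual)
    for action, app, version in actions:
        if action == "remove":
            del new_state[app]
        else:
            new_state[app] = version
    return new_state
```

For example, with a desired state of {"web": "1.3.0", "api": "2.0.1"} and an actual state of {"web": "1.2.0", "debug-tool": "0.1.0"}, the plan deploys api, updates web, and removes debug-tool, after which the two states match.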

This approach gives a very clear summary of the applications and versions currently deployed to the cluster. If one or more applications need to be deployed to the cluster, a developer can update the repository and deploy a new app or roll back to an earlier version using a PR. This approach can also facilitate rebuilding a damaged cluster, or deploying a copy of the same applications to a different cluster, used for testing or staging for example.

ChatOps

ChatOps is the name given to the process of bringing development related conversations, actions, and workflows into a Chat application used at your organization.

ChatOps is also known as conversation-driven development. The term may have originated with GitHub, which has a number of ChatOps-related videos on YouTube that are worth watching.

There are a number of reasons why a company might like to use a ChatOps approach. A team can work together developing a software product and keep any conversations in one place, visible to the entire team, instead of spread out over emails or other sources.

ChatOps also works well for any teams with remote team members, or those in different time zones. Team members can sign in when they begin their day and see any conversations, actions or alerts that may have occurred while they were offline.

While performing any day-to-day development work, events such as commits or deployments can be sent into the same chat rooms so team members can see the ongoing development. Any important events such as a failed deployment can be seen straight away for the team to respond to.
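
As a small sketch of what "sending an event into a chat room" might involve, the code below turns a deployment event into a short message and wraps it in a JSON body with a single "text" field, a shape that many chat webhooks accept. The event fields and function names are assumptions for this example, and actually posting the payload to your chat tool is left out.

```python
import json


def format_event(event):
    """Turn a deployment event dict into a one-line chat message."""
    status = event["status"]
    prefix = "FAILED" if status == "failed" else "OK"
    return (
        f"[{prefix}] {event['app']} {event['version']} "
        f"deploy to {event['environment']}: {status}"
    )


def build_webhook_payload(event):
    """Build a JSON body with a single "text" field for a chat webhook."""
    return json.dumps({"text": format_event(event)})
```

Putting the status at the start of the message makes a failed deployment easy to spot when scrolling back through a busy room.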

By performing actions in a shared chat room, your team can share information easily as they work. New team members will be able to see the actions and perform the same steps on their own.

If you are using AWS, Amazon Simple Notification Service (SNS) is a great resource for receiving notifications for events that occur within AWS. An event such as an RDS instance reboot can be sent to SNS, which in turn sends you a notification via email, SMS, or webhook. Events can also be sent to AWS Chatbot, a service that can receive the event information, process it, and send it on to popular chat apps such as Slack.

One item to be aware of when automating actions within a Chat application is to keep actions secure. If you have automated an action such as rebooting a server, you might want to limit that action to a specific room or set of users to avoid misuse.
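
One simple way to picture this kind of restriction is an allow-list check that runs before any sensitive command. In the sketch below, the room name, user names, and the reboot command format are all hypothetical, and nothing is actually rebooted. The function only returns the reply a chat bot might post.

```python
ALLOWED_ROOMS = {"ops-production"}  # hypothetical room name
ALLOWED_USERS = {"alice", "bob"}    # hypothetical user names


def is_authorized(user, room):
    """Allow a sensitive command only from an approved user in an approved room."""
    return user in ALLOWED_USERS and room in ALLOWED_ROOMS


def handle_command(user, room, command):
    """Return the chat reply for a command; no real action is performed here."""
    parts = command.split(maxsplit=1)
    if len(parts) == 2 and parts[0] == "reboot":
        if not is_authorized(user, room):
            return f"Sorry {user}, you are not allowed to run '{command}' in #{room}."
        return f"Rebooting {parts[1]} (requested by {user})."
    return f"Unknown command: {command}"
```

Because the refusal is posted in the shared room as well, the rest of the team can see when someone attempts an action they are not permitted to run.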

NoOps

NoOps (No Operations) is the idea that things can be so automated that a company won’t need dedicated SysOps roles.

NoOps could be thought of as the end result of a successful move to a DevOps culture, one where developers can build, deploy, and monitor their software through automation and self-service with no need for a SysOps team.

NoOps is a worthwhile goal. If you can automate things to a degree that makes your work or a developer’s work easier, you’ll be in a great place with a lot of experience under your belt.

Don’t be afraid of automation

Some recommended reading

  • “The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win”, by Gene Kim, Kevin Behr and George Spafford
  • “The Unicorn Project: A Novel about Developers, Digital Disruption, and Thriving in the Age of Data”, by Gene Kim
