Regex Named Groups and Using Them in C#

While working on an issue in Codespaces, I figured it was a good case for a regular expression. Each time I work with regex I need to figure out how it works again, but each time I am also impressed with how powerful it is. And this time I actually learned something new: regex named groups. Regex lets you name each matched group, and those names are very easy to use afterwards in C# code. In this post, I will show you how to do exactly that.

Outline

The issue I was working on had to do with Azure DevOps URLs. Apparently, the Azure DevOps URL should be formatted like https://myaccount@dev.azure.com/myaccount/foo/bar in order to work properly while using git.

But when you're typing a URL from memory, it's very easy to end up with https://dev.azure.com/myaccount/foo/bar instead. However, this will make your git clone fail. Since the part right after the domain is the same thing we need in front of the @ sign, we could at least make an attempt to unblock our user here.

First Attempt without Regex Named Groups

For a first attempt I simply started to split the string. Since you can probably use http and https, or even leave the protocol out, I would go look for dev.azure.com and split on that. Then find the next /, but maybe there is no next /. Indices, lengths, putting it all together again... The code for this started to become very verbose for something seemingly simple. There had to be another way.
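To give an idea of why this gets verbose, here is a rough sketch of what such a string-splitting attempt could look like. This is not the code I actually wrote; the class and method names (ManualUrlFix, AddAccountPrefix) are made up for illustration, and it assumes the account name is whatever sits between dev.azure.com/ and the next slash.

```csharp
using System;

static class ManualUrlFix
{
    // Hypothetical helper: rebuilds the URL with plain string operations only.
    public static string AddAccountPrefix(string url)
    {
        const string host = "dev.azure.com/";

        // Find the host; give up and return the input if it is missing.
        int hostIndex = url.IndexOf(host, StringComparison.OrdinalIgnoreCase);
        if (hostIndex < 0) return url;

        // Leave URLs alone that already have "account@" in front of the host.
        if (hostIndex > 0 && url[hostIndex - 1] == '@') return url;

        // The account name runs from the end of the host to the next '/', if there is one.
        int accountStart = hostIndex + host.Length;
        int nextSlash = url.IndexOf('/', accountStart);
        string account = nextSlash < 0
            ? url.Substring(accountStart)
            : url.Substring(accountStart, nextSlash - accountStart);
        if (account.Length == 0) return url;

        // Everything before the host is the (optional) protocol.
        string protocol = url.Substring(0, hostIndex);
        return protocol + account + "@" + url.Substring(hostIndex);
    }

    static void Main()
    {
        Console.WriteLine(AddAccountPrefix("https://dev.azure.com/myaccount/foo/bar"));
    }
}
```

Even in this trimmed-down form, most of the lines are bookkeeping about indices and edge cases rather than about the URL itself.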

Enter Regex

Then I remembered regex! As mentioned before, each time I do something with regex I need to relearn it, but once you get the hang of it, that goes faster each time. At the same time, whenever I work with regex I am reminded how powerful it is! This time I would even discover a feature that makes it more powerful than I realized.

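Pretty quickly I came up with a regex pattern like this: (http[s]?://)*(dev.azure.com/([a-zA-Z]*)(.*)) which will recognize all kinds of variations without having to account for them in code. (Strictly speaking, an unescaped dot matches any character, so dev\.azure\.com would be the more precise way to write the domain.)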
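As a small sketch of what this pattern captures, the snippet below runs it in C# and reads the groups by number. The pattern is the one from this post, unchanged; the class and method names are just for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class NumberedGroupsDemo
{
    // The pattern from the post, unchanged (dots are left unescaped).
    const string Pattern = @"(http[s]?://)*(dev.azure.com/([a-zA-Z]*)(.*))";

    // Returns groups 1 to 4; a group that did not take part in the match yields "".
    public static string[] Capture(string url)
    {
        Match match = Regex.Match(url, Pattern, RegexOptions.IgnoreCase);
        // Groups[0] is the whole match; 1..4 follow the opening brackets from left to right.
        return new[]
        {
            match.Groups[1].Value, // protocol
            match.Groups[2].Value, // domain plus everything after it
            match.Groups[3].Value, // account name
            match.Groups[4].Value  // rest of the path
        };
    }

    static void Main()
    {
        string[] groups = Capture("https://dev.azure.com/myaccount/foo/bar");
        for (int i = 0; i < groups.Length; i++)
        {
            Console.WriteLine($"Group {i + 1}: '{groups[i]}'");
        }
    }
}
```

Having to remember which number belongs to which part is exactly the kind of thing that gets confusing once the pattern changes.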

I won't go into all the details of regex; you could write a book about that. But some tools and references that can help you with this are https://regexr.com/ and http://regexstorm.net/. Note that regex comes in several dialects. For instance, when using regexr, you'll need to escape the forward slashes with a backslash (like http:\/\/).
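To illustrate the dialect point for .NET specifically: as far as I know, a forward slash needs no escaping in a C# pattern, and the escaped spelling is accepted as well. A minimal sketch (the class and method names are made up for this example):

```csharp
using System;
using System.Text.RegularExpressions;

static class SlashEscapingDemo
{
    // True when both spellings of the pattern agree that the input matches.
    public static bool BothMatch(string input)
    {
        bool plain = Regex.IsMatch(input, @"http[s]?://");     // no escaping needed in .NET
        bool escaped = Regex.IsMatch(input, @"http[s]?:\/\/"); // the regexr-style spelling also works
        return plain && escaped;
    }

    static void Main()
    {
        Console.WriteLine(BothMatch("https://dev.azure.com/myaccount"));
    }
}
```

So a pattern copied over from regexr should still behave the same in C#, as long as you put it in a verbatim string (the @ prefix) so the backslashes survive.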

What I will say is that each portion in brackets shows up as a separate group. But some groups are optional, and this presented a problem: I still needed to write code to check whether all the necessary parts are there.
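A sketch of the kind of check I mean, using the same unnamed pattern. Each Group has a Success property that is false when an optional group did not take part in the match. The helper name HasProtocolAndAccount is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class OptionalGroupCheck
{
    // The unnamed-group pattern from the post, unchanged.
    const string Pattern = @"(http[s]?://)*(dev.azure.com/([a-zA-Z]*)(.*))";

    // Returns true only when both a protocol and a non-empty account name were captured.
    public static bool HasProtocolAndAccount(string url)
    {
        Match match = Regex.Match(url, Pattern, RegexOptions.IgnoreCase);
        if (!match.Success) return false;

        Group protocol = match.Groups[1]; // Success is false when the optional group matched zero times
        Group account = match.Groups[3];  // takes part in every match, but may be empty
        return protocol.Success && account.Length > 0;
    }

    static void Main()
    {
        Console.WriteLine(HasProtocolAndAccount("https://dev.azure.com/myaccount/foo"));
        Console.WriteLine(HasProtocolAndAccount("dev.azure.com/myaccount/foo"));
    }
}
```

The magic numbers 1 and 3 are the weak spot here: add or remove a pair of brackets in the pattern and they silently point at something else.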

Regex Named Groups to the Rescue

As it turns out, you can name each group. By simply putting ?<yourname> right after the opening bracket you can refer to the group by its name. The resulting expression would be something like this: (?<protocol>http[s]?://)*(?<domainandpath>dev.azure.com/(?<accountname>[a-zA-Z]*)(.*))

You can see that there is a nested group: accountname sits inside domainandpath. Since I don't care what comes after the account name, I put the domain and the rest of the path in one big group that I can simply re-add to the new URL as a whole.
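To make the nesting visible, here is a small sketch that reads the three named groups from that expression. The expression is the one shown above, unchanged; the class name, method name and tuple field names are my own for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupsDemo
{
    // The named-group expression from the post, unchanged.
    const string Pattern =
        @"(?<protocol>http[s]?://)*(?<domainandpath>dev.azure.com/(?<accountname>[a-zA-Z]*)(.*))";

    // Returns the three named captures; a group that did not match yields an empty string.
    public static (string Protocol, string DomainAndPath, string AccountName) Parse(string url)
    {
        Match match = Regex.Match(url, Pattern, RegexOptions.IgnoreCase);
        return (match.Groups["protocol"].Value,
                match.Groups["domainandpath"].Value,
                match.Groups["accountname"].Value);
    }

    static void Main()
    {
        var parts = Parse("dev.azure.com/myaccount/foo/bar");
        Console.WriteLine($"protocol:      '{parts.Protocol}'");
        Console.WriteLine($"domainandpath: '{parts.DomainAndPath}'");
        Console.WriteLine($"accountname:   '{parts.AccountName}'");
    }
}
```

Note that the account name shows up twice: once on its own, and once as part of domainandpath, because the inner group is captured in addition to the outer one.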

When we put this together in C# code, it looks like this:

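var azureDevOpsUrl = "https://dev.azure.com/jfversluis";
var resultUrl = "";
// Note: unlike the expression above, the protocol group has no trailing * here, so a protocol is required.
var azureDevOpsMatch = Regex.Match(azureDevOpsUrl, "(?<protocol>http[s]?://)(?<domainandpath>dev.azure.com/(?<accountname>[a-zA-Z]*)(.*))", RegexOptions.IgnoreCase);
if (azureDevOpsMatch.Success)
{
    // Rebuild the URL as protocol + account name + "@" + domain and path.
    resultUrl = $"{azureDevOpsMatch.Groups["protocol"]}{azureDevOpsMatch.Groups["accountname"]}@{azureDevOpsMatch.Groups["domainandpath"]}";
}
// Prints https://jfversluis@dev.azure.com/jfversluis (or an empty line when nothing matched).
Console.WriteLine(resultUrl);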

We check whether there is a match. If there is, we rebuild the URL from the named groups. The cool thing is that we can just reference them by name through the string indexer of the Groups collection 🤯.
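As a possible variation, the same named groups can also be used in a replacement string via the ${name} syntax of Regex.Replace, which returns the input unchanged when nothing matches. The sketch below is not the code from the actual fix, and it makes a few assumptions of its own: the pattern is anchored with ^, the dots are escaped, and the account name is widened to digits and hyphens on the assumption that such account names exist. The names AzureDevOpsUrlFixer and AddAccountPrefix are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class AzureDevOpsUrlFixer
{
    // Assumptions (not from the original snippet): ^ anchor, escaped dots,
    // and an account name that may also contain digits and hyphens.
    const string Pattern =
        @"^(?<protocol>https?://)(?<domainandpath>dev\.azure\.com/(?<accountname>[a-zA-Z0-9-]+).*)";

    // ${name} in the replacement string refers to the named group of the same name.
    const string Replacement = "${protocol}${accountname}@${domainandpath}";

    // Returns the rewritten URL, or the input unchanged when the pattern does not match.
    public static string AddAccountPrefix(string url)
    {
        return Regex.Replace(url, Pattern, Replacement, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(AddAccountPrefix("https://dev.azure.com/jfversluis"));
        Console.WriteLine(AddAccountPrefix("https://jfversluis@dev.azure.com/jfversluis"));
    }
}
```

Because the protocol must be followed directly by dev.azure.com, a URL that already has the account@ prefix does not match and is left alone.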

In Closing

Pretty cool and powerful, right?

Another thing that helped me iterate on this pretty fast is try.dot.net. This little website lets you run C# code as if it were a console application. Just take the code from above (add using System.Text.RegularExpressions; at the top) and run it! You can quickly iterate on it to see if you get the right results and then put it in your codebase. It seems from the website that you will be able to embed these code runners in your browser at a later stage. #awesomesauce!

If you're looking for more development-related tips and tricks, maybe you're also interested in the Git rebase post or the Git empty commit post. I also try to do some videos, so please subscribe to my YouTube channel.
