So how do you get a sub-string in Python? Well, Python has a handy dandy feature called "slicing" that can be used for getting sub-strings from strings.

But first, we have to go over a couple of things to understand how this works.

Slicing Python Objects

Strings in Python are sequences of characters. They behave a little differently from lists (Python's closest thing to arrays) because they can't be changed in place, but for the most part you can treat them like arrays.

Using this information, we can use Python's sequence functionality, called "slicing", on our strings! Slicing is a general piece of functionality that can be applied to any sequence type in Python, such as lists, tuples, and strings.

Okay, so here's a concrete example using a simple array to start off.
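A minimal sketch of such an example, assuming a list named `numbers` (the name and values are illustrative, not taken from the article):

```python
# Illustrative list; the name and values are assumptions for this sketch.
numbers = [1, 2, 3, 4, 5]

# Slice from index 0 up to (but not including) index 3.
first_three = numbers[0:3]
print(first_three)  # [1, 2, 3]
```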

As you can see, this gives us a subset of the array up to the 3rd element. Slicing takes two "arguments" that specify the start and end positions you would like in your array. Positions count from 0, and the element at the end position itself is not included.

Syntax: array[start:end]

So in our example above, if we only wanted the elements 2 and 3, we would do the following:
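Something along these lines would do it, again assuming an illustrative list called `numbers`:

```python
# Illustrative list; the name and values are assumptions for this sketch.
numbers = [1, 2, 3, 4, 5]

# Index 1 holds the value 2 and index 2 holds the value 3.
# The end index (3) is excluded from the result.
two_and_three = numbers[1:3]
print(two_and_three)  # [2, 3]
```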

Alright, alright. What does this have to do with sub-strings? Excellent question!

Getting Sub-strings: Slicing Python Strings!

Above, I mentioned that we can pretty much treat strings like arrays. That means that we can apply the same logic to our strings!

Here's an example:
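A small sketch, assuming a sample string named `text` (the name and value are illustrative):

```python
# Illustrative string; the name and value are assumptions for this sketch.
text = "Hello, World!"

# Indexing a string works like indexing a list.
first_char = text[0]
print(first_char)  # H

# Slicing a string returns a sub-string: indexes 0 through 4.
hello = text[0:5]
print(hello)  # Hello
```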

Wow! We accessed the character just like it was an element in an array! Awesome!

So what we see here is a "sub-string". To get a sub-string from a string, it's as simple as inputting the desired start position of the string as well as the desired end position.

Of course, we can also omit either the start position or the end position, which tells Python to start our sub-string from the start or end it at the end, respectively.

Here's another example using our string above.
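A sketch of what that might look like, assuming the same kind of illustrative `text` string:

```python
# Illustrative string; the name and value are assumptions for this sketch.
text = "Hello, World!"

# No start: begin at the start of the string.
start_omitted = text[:5]
print(start_omitted)  # Hello

# No end: run through to the end of the string.
end_omitted = text[7:]
print(end_omitted)  # World!

# Neither start nor end: the whole string.
both_omitted = text[:]
print(both_omitted)  # Hello, World!
```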

Alright, that's all well and good, but what the heck did I do in that last line? I didn't specify the start or the end, so how come it still worked?

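Well, what we did there was tell Python to get everything from the start element all the way to the end element. It's completely valid. On a list, this creates a new (shallow) copy, which you could use if you need a copy of the original to modify! Strings can't be modified in place, so a copy of a string is rarely needed; the slice simply gives you back the same text.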
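The copying behaviour is easiest to see with a list, since lists can be changed in place and strings can't. A minimal sketch, with illustrative variable names:

```python
# Illustrative names and values; assumptions for this sketch.
original = [1, 2, 3]

# A full slice of a list builds a new list with the same elements.
duplicate = original[:]
duplicate.append(4)

print(original)   # [1, 2, 3]
print(duplicate)  # [1, 2, 3, 4]

# A full slice of a string gives back equal text.
text = "Hello"
same_text = text[:]
print(same_text == text)  # True
```

Changing `duplicate` leaves `original` alone, which is the point of taking the copy.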

Reverse Sub-string Slicing in Python

Another example, using extended slicing, can get the sub-string in reverse order. This is just a "for fun" example, but if you ever need to reverse a string in Python, or get the reversed sub-string of a string, this could definitely help.

Here we go:
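A sketch of the idea, assuming an illustrative string named `text`:

```python
# Illustrative string; the name and value are assumptions for this sketch.
text = "Hello, World!"

# A step of -1 walks the whole string backwards.
reversed_text = text[::-1]
print(reversed_text)  # !dlroW ,olleH

# Take a sub-string first, then reverse just that part.
reversed_hello = text[0:5][::-1]
print(reversed_hello)  # olleH
```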

Awesome!

It's out of the scope of this article to explain extended slicing, so I won't say too much about it. The only thing different is that extra colon at the end and the number after it. The extra colon tells Python that this is an extended slice, and the "-1" is the step to use when traversing the string, so Python walks it backwards one character at a time. If we had put a "1" where the "-1" is, we'd get the string in its normal, unreversed order.
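To see what that last number does, here is a quick sketch with a few different values, using an illustrative string:

```python
# Illustrative string; the name and value are assumptions for this sketch.
letters = "abcdef"

forward = letters[::1]         # every character, left to right
every_second = letters[::2]    # every second character
backward = letters[::-1]       # every character, right to left
backward_second = letters[::-2]  # every second character, right to left

print(forward)          # abcdef
print(every_second)     # ace
print(backward)         # fedcba
print(backward_second)  # fdb
```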

There you have it! It's really easy to get sub-strings in Python, and I hope I educated you more in Python-fu.

Hasta luego!

  • David

    How do I get the next whitespace-delimited substring after a known substring within a string?
    In meta code it might look like this:

    str.get-next-right-token-after-pattern(pattern, delimiter)
    ?

    I can get to the beginning of the next token with
    str[str.find(pattern) + len(pattern) : ??],
    but I can't figure out how to derive the token end-point. With a regex? All attempts complain that I need an integer.

    I’m a python newbie, and I can only find methods that give me substrings within known string positions.

    • http://jacksonc.com Jackson Cooper

      What’s an example?

      • David

        I think I figured it out.
        consider:
        str is a large string
        pat is a pattern of characters to be searched for in str.
        tok is the sought-after substring of chars (token), whitespace delimited, that follows pat in str

        1. find the beginning of the pat in str using str.find(pat)
        pos = str.find(pat)

        2. make a new string from the end of pat to the end of str
        tok = str[pos + len(pat): len(str)]

        3. a little cleanup: strip off any left-side white space.
        tok = tok.lstrip()

        4. the token I want is the 1st substring of characters in tok up to the first whitespace:
        tok = tok.split(' ')[0]

        DS
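David's four steps can be folded into one small function. This is only a sketch: the name `token_after` is made up for illustration, and it swaps his `strip` plus `split(' ')` for a bare `split()`, which splits on any run of whitespace (spaces, tabs, newlines) and ignores leading whitespace. It also returns `None` when the pattern is missing, since `find` returns -1 in that case.

```python
def token_after(text, pattern):
    """Return the whitespace-delimited token following pattern, or None.

    Hypothetical helper for illustration; not part of any library.
    """
    pos = text.find(pattern)
    if pos == -1:
        # Pattern not found, so there is no following token.
        return None

    # Everything after the end of the pattern.
    rest = text[pos + len(pattern):]

    # split() with no arguments skips leading whitespace and
    # splits on any run of whitespace characters.
    parts = rest.split()
    return parts[0] if parts else None


print(token_after("user name: alice logged in", "name:"))  # alice
```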