Why is "string" not being changed when I use += in Python?

PDS OWNER CALIN (Calin Baenen) on June 17, 2019

I am working on a new project, ya-ta ya-ta... It has been a while since I have worked with Python, and while most of it is easy, in my code why is t...
 

I'd recommend printing the value of the string variable right after the line that is supposed to update it. When I did this, string was being added to, but the same character was added over and over. This is because of the while state == 1 loop, which keeps appending the current character and never moves on to the next one. To fix this, remove the while loop and move its body into an if statement, like this:

if state == 1:
    string += char
else:
    tok += char

However, this would only add the first character of the string and then go back to adding to tok, because the state would immediately be changed back. All that needs to be done is to remove the " or ' from tok once it is detected, like this:

if tok == "\"" or tok == "'":
    tok = tok[:-1] # remove the last character from tok

The closing " also needs to be skipped rather than added, so an if statement can be added at the very top of the for loop to handle it:

if state == 1 and (char == "\"" or char == "'"):
    continue

After these few changes, the string variable should work a bit better.
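
For reference, the repeated-character behaviour described at the start of this reply can be reproduced with a small sketch. The original loop isn't shown in this thread, so its shape here is a guess; the function name append_in_while and the max_iterations cap are made up for illustration (the cap only exists so the demo terminates).

```python
def append_in_while(char, max_iterations=5):
    """Mimic a `while state == 1` loop that never advances to the next character."""
    state = 1
    string = ""
    iterations = 0  # hypothetical safety cap; the real loop would never stop
    # Nothing inside the loop changes `state` or picks a new `char`,
    # so the same character is appended on every pass.
    while state == 1 and iterations < max_iterations:
        string += char
        iterations += 1
    return string


print(append_in_while("H"))  # prints HHHHH
```

Replacing the while with an if means each pass of the outer for loop appends at most one character and then moves on.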

 

Still not working... :\
Did I do it wrong?
Code:

for char in command:
    if state == 1 and (char == "\"" or char == "'"):
        continue
    if state == 1:
        string += char
    else:
        tok += char
    if tok == "\"" or tok == "'":
        tok = tok[:-1]
        if state == 0:
            state = 1
        elif state == 1:
            state = 0
    elif tok == "out" and state == 0:
        commandToRun = "print"
 

So it looks like I forgot about a change that needs to be made. Currently, the code can't handle a command that comes before the string. In addition, if you only want state to be 1 inside a string and 0 outside a string, I would recommend changing it to a boolean so that operations such as switching it are easier. Outside the for loop, this only requires changing state = 0 to state = False. However, it requires a few more changes inside the for loop.

In the first if statement, state == 1 can just become state, so the if statement ends up like this:

if state and (char == "\"" or char == "'"):
    state = False  # if we're in a string and see a quote, set the state to not be in a string
    continue  # continue to the next character

Additionally, the next if statement can just become if state: instead of if state == 1:.
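
As a rough illustration of why a boolean makes switching easier, here is a small side-by-side sketch. The variable names int_state and bool_state are made up for the comparison; in the actual loop there is just the one state variable.

```python
# Integer flag: flipping it needs a branch for each value.
int_state = 0
if int_state == 0:
    int_state = 1
elif int_state == 1:
    int_state = 0

# Boolean flag: flipping it is a single expression.
bool_state = False
bool_state = not bool_state

print(int_state, bool_state)  # prints 1 True
```

Conditions also get shorter, since if state: reads the flag directly instead of comparing it to 1.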

Now the change I forgot about can be implemented. To know whether to enter a string, only the current character needs to be looked at, not the entire token. Additionally, since the end of a string was already handled in the first if statement, seeing a " or ' at this point in the loop can be assumed to mean that a string is starting. This changes the last if statement to look like this:

if char == "\"" or char == "'":
    tok = tok[:-1]  # get rid of the " or ' added to tok
    state = True  # set the state to be in a string

After doing these changes, I get the following output ($ represents the bash prompt):

$ python3 string.py
out "Hello, world!"
Hello, world!
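
For anyone who wants to see the pieces together, here is a minimal sketch of the whole loop with the changes above applied. Some of it is assumed: the lex_command wrapper, the initial values of tok and string, keeping the elif tok == "out" branch attached to the last if, and the final print are my guesses at the surrounding code, not something taken from the original file.

```python
def lex_command(command):
    """Scan one command and return (command_to_run, string)."""
    state = False          # True while inside a quoted string
    tok = ""               # assumed to start empty
    string = ""            # assumed to start empty
    command_to_run = None  # hypothetical default when no command is seen

    for char in command:
        # A quote seen while inside a string closes it; skip the quote itself.
        if state and (char == "\"" or char == "'"):
            state = False
            continue

        if state:
            string += char
        else:
            tok += char

        # Closing quotes were handled above, so a quote here opens a string.
        if char == "\"" or char == "'":
            tok = tok[:-1]  # get rid of the " or ' added to tok
            state = True
        elif tok == "out" and not state:
            command_to_run = "print"

    return command_to_run, string


command_to_run, string = lex_command('out "Hello, world!"')
if command_to_run == "print":
    print(string)  # prints Hello, world!
```

One limitation of this sketch is that either quote character closes a string, so a string like "it's" would be cut off at the apostrophe. Remembering which quote opened the string would be one way around that.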

One closing recommendation: if you really want to get involved in creating programming languages, I would recommend reading the book Language Implementation Patterns by Terence Parr. It goes through how to make a lexer and multiple types of parsers. Unfortunately, the example code is written in Java and the book uses the author's parser generator, ANTLR. That being said, it is still a good look into a variety of parsing techniques.

 

I found the answer in my latest post; please look at that for the new question I have...

 

After looking at that post, it seems that implementing the changes I mentioned here should alleviate the problem. In particular, the change mentioned in the third code block is what'll fix the string issue. I've put it below for reference:

if char == "\"" or char == "'":
    tok = tok[:-1]  # get rid of the " or ' added to tok
    state = True  # set the state to be in a string