DEV Community: Md Abdul Hasib

FAANG Interview Question | Longest Valid Parentheses

Md Abdul Hasib — Fri, 22 Apr 2022 08:56:51 +0000

Given a string containing just the characters '(' and ')', find the length of the longest valid (well-formed) parentheses substring.

Example 1:

Input: s = "(()"
Output: 2
Explanation: The longest valid parentheses substring is "()".

Example 2:

Input: s = ")()())"
Output: 4
Explanation: The longest valid parentheses substring is "()()".

Example 3:

Input: s = ""
Output: 0

Constraints:

0 <= s.length <= 3 * 10^4
s[i] is '(' or ')'.

Solution 1:

# O(n) time | O(n) space - where n is the length of the input string
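def longestBalancedSubstring(string):
    # The stack holds indices. Its top is always the index just before the
    # start of the current candidate substring (-1 before anything is read).
    stack = [-1]
    longest = 0
    for idx, ch in enumerate(string):
        if ch == "(":
            stack.append(idx)
        else:
            stack.pop()
            if len(stack) == 0:
                # Unmatched ")": it becomes the new boundary.
                stack.append(idx)
            else:
                # Matched pair: the valid run starts right after stack[-1].
                longest = max(longest, idx - stack[-1])

    return longest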
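
To see how the stack evolves, here is a small tracing sketch of the same approach. The helper name `trace_longest_valid` is made up for this illustration; it records the index, a copy of the stack, and the best length after each character.

```python
def trace_longest_valid(s):
    """Run the stack approach and record (index, stack, longest) after each step."""
    stack = [-1]
    longest = 0
    steps = []
    for idx, ch in enumerate(s):
        if ch == "(":
            stack.append(idx)
        else:
            stack.pop()
            if not stack:
                # Unmatched ")": use it as the new boundary.
                stack.append(idx)
            else:
                longest = max(longest, idx - stack[-1])
        steps.append((idx, list(stack), longest))
    return steps


for step in trace_longest_valid(")()())"):
    print(step)
# (0, [0], 0)
# (1, [0, 1], 0)
# (2, [0], 2)
# (3, [0, 3], 2)
# (4, [0], 4)
# (5, [5], 4)
```

The boundary index at the bottom of the stack is what lets "()()" be counted as one run of length 4 instead of two runs of length 2.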

Better Solution

# O(n) time | O(1) space - where n is the length of the input string
def longestBalancedSubstring(string):
    return max(
            get_longest_sub_string_count(string, True),
            get_longest_sub_string_count(string, False)
        )

def get_longest_sub_string_count(string, left_to_right):
    # Scanning right to left mirrors the problem, so ")" plays the opening role.
    open_ch = "(" if left_to_right else ")"
    step = 1 if left_to_right else -1
    idx = 0 if left_to_right else len(string) - 1
    open_count = 0
    close_count = 0
    longest = 0
    while idx > -1 and idx < len(string):
        ch = string[idx]
        if ch == open_ch:
            open_count += 1
        else:
            close_count += 1

        if close_count == open_count:
            # Balanced so far: everything since the last reset is valid.
            longest = max(longest, open_count * 2)
        elif close_count > open_count:
            # Too many closers: no valid substring can span this position.
            open_count = 0
            close_count = 0

        idx += step

    return longest
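
Why are two passes needed? The sketch below runs each direction separately to show that a single pass can miss the answer. The helper names `scan`, `left_to_right`, and `right_to_left` are made up for this illustration and are not part of the solution above.

```python
def scan(s, open_ch, indices):
    """Counter-based scan over the given index order."""
    open_count = close_count = longest = 0
    for i in indices:
        if s[i] == open_ch:
            open_count += 1
        else:
            close_count += 1
        if open_count == close_count:
            longest = max(longest, 2 * open_count)
        elif close_count > open_count:
            open_count = close_count = 0
    return longest


def left_to_right(s):
    return scan(s, "(", range(len(s)))


def right_to_left(s):
    return scan(s, ")", range(len(s) - 1, -1, -1))


for s in ["(()", "())", ")()())"]:
    print(repr(s), left_to_right(s), right_to_left(s))
# '(()' 0 2
# '())' 2 0
# ')()())' 4 0
```

For "(()" the left-to-right counters never become equal because of the extra "(" at the start, so only the right-to-left pass finds the answer. Taking the maximum of both passes covers both cases.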

Populate data-frame faster - [from 4 hours to 15 seconds]

Md Abdul Hasib — Thu, 16 Sep 2021 04:03:51 +0000

Hello guys. The main problem was described on this link.

To give you a high-level overview: I had to populate a data frame with about 1,300 columns and 1 million rows.

What I was doing first was:

Prepare the data frame with zeros for all columns and rows.
Then iterate through each column and row and assign the value for each cell after some dynamic processing.

Here is the code.

# main part, not complete code
import numpy as np
import pandas as pd

def populate_data_frame_in_prediction_time(data, columns):
    unknown_col = "nan"
    columns_set = set(columns)
    result_data_frame = pd.DataFrame(0, index=np.arange(len(data)), columns=columns)

    for prefix in data.columns: # O(m)
        unknown_column_name = str(prefix) + "_" + str(unknown_col)
        for index, row in data.iterrows(): #O(n)
            value = row[prefix]
            result_column_name = str(prefix) + "_" + str(value)
            if result_column_name not in columns_set: # O(1)
                result_column_name = unknown_column_name

            # .at avoids chained assignment, which may silently fail to update the frame
            result_data_frame.at[index, result_column_name] = 1

    result_data_frame = result_data_frame.astype('uint8')
    return result_data_frame

It was a very slow process: it took more than 4 hours in my scenario.
Then I did slightly better, but still not in the best way.

So what I did in this intermediate approach was:

Instead of working with all columns at once, pick one column and make a smaller data frame just for it, since I was converting categorical values to one-hot encoding.
Then iterate through each row of only that smaller data frame and assign the value for each cell after some dynamic processing.
Repeat the procedure for the next column of the raw data.

By doing this, the processing time improved from 4 hours to 20 minutes. Here is the code that runs for each column (series) of the raw data.

def _custom_one_hot_encoding_1d(series, column_list):
    unknown_col = "nan"
    prefix = series.name

    number_of_rows, number_of_col = series.shape[0], len(column_list)

    dummy_data = np.array([np.zeros(number_of_rows, dtype=int)] * number_of_col).T
    df = pd.DataFrame(dummy_data, columns=column_list)
    for index, name in series.items():
        if not name:
            name = unknown_col

        column_name = str(prefix) + "_" + str(name)
        if column_name not in df:
            column_name = str(prefix) + "_" + unknown_col
        # .at avoids chained assignment, which may silently fail to update the frame
        df.at[index, column_name] = 1
    return df

Then, to improve it further, I did the following:

Previously, I prepared the data frame first and then populated it.
I reversed the order.
I prepared the data first in an array (it can be a Python list or a NumPy array). To do that, I had to write some custom code, because it was a pandas data frame earlier.
After preparing the data, I filled the data frame with the full data in one step.

It improved the performance drastically. Now it takes only about 15 seconds to do that processing. Here is the code.

def _custom_one_hot_encoding_1d(series, column_list):
    unknown_col = "nan"
    prefix = series.name

    number_of_rows, number_of_col = series.shape[0], len(column_list)

    # Map each column name to its position for O(1) lookups.
    column_idx = {name: i for i, name in enumerate(column_list)}
    result_arr = np.zeros((number_of_rows, number_of_col), dtype=int)

    for index, name in series.items():
        if not name:
            name = unknown_col

        column_name = str(prefix) + "_" + str(name)
        # Check the dict, not the list: membership in a list is O(number of columns).
        if column_name not in column_idx:
            column_name = str(prefix) + "_" + unknown_col
        result_arr[index, column_idx[column_name]] = 1

    df = pd.DataFrame(result_arr, columns=column_list)
    return df
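
As one possible next step in the same direction, the per-row writes can also be replaced by a single NumPy indexing operation. This is only a sketch under assumptions, not something I have timed: the function name `one_hot_vectorized` is made up, it assumes the `<prefix>_nan` fallback column exists in `column_list`, and unlike the code above it returns `uint8` values and keeps the index of the input series.

```python
import numpy as np
import pandas as pd


def one_hot_vectorized(series, column_list):
    """Hypothetical variant: compute all target positions, then fill in one step."""
    prefix = str(series.name)
    column_idx = {name: i for i, name in enumerate(column_list)}
    # Assumption: the "<prefix>_nan" fallback column is present in column_list.
    unknown_idx = column_idx[prefix + "_nan"]

    # One target column position per row; empty values and unseen
    # categories both fall back to the unknown column.
    targets = np.array(
        [
            column_idx.get(prefix + "_" + str(value), unknown_idx) if value else unknown_idx
            for value in series
        ],
        dtype=int,
    )

    result_arr = np.zeros((len(series), len(column_list)), dtype="uint8")
    # Fancy indexing sets exactly one cell per row in a single operation.
    result_arr[np.arange(len(series)), targets] = 1
    return pd.DataFrame(result_arr, columns=column_list, index=series.index)


colors = pd.Series(["red", "blue", "", "green"], name="color")
encoded = one_hot_vectorized(colors, ["color_red", "color_blue", "color_nan"])
print(encoded)
```

In this example the empty string and the unseen value "green" both end up in the `color_nan` column.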

This is my story. Thank you guys for reading. If you have any better ideas, please suggest them here. I will be happy to try them in my code.

How to Publish and Use Your Python Package in and from Private Bitbucket or Github

Md Abdul Hasib — Thu, 04 Mar 2021 10:14:29 +0000

Python has become one of the most popular programming languages. One major reason is that we, regular Python users, are free to share our code and others can use it very conveniently. A formal way of such code sharing is to pack all your code into a package and upload it to the Python Package Index (pypi.org), through which other Python users can install your package easily using the pip tool.

If you have published a Python package yourself, you should know that the process is not difficult. However, for those who have never done it, you may have mistakenly thought that it must be a painful process.

In this article, I’ll show you the steps for preparing your package and publishing it.
But I will do it by uploading to Bitbucket, not to PyPI, because if you are doing it for a company, they may not want the package uploaded for public use. So let's get into it.

Obviously, the first step is to complete your project. However, we all understand that packages are never perfect. As a result, you don’t have to wait for the completion of your project before you figure out how to publish your package.

For the purpose of the current tutorial, let’s suppose that the directory that has all your Python files (e.g. file_1.py, as shown later) is called your_package. This is the one that you want to publish. To keep things simple, the file file_1.py has just the following code, which will be used when we’re ready to import the package after its publication:

def hello_world():
    print("Hello World!")

1. Prepare Files

Before we dive into the details for publishing, let’s first structure the directory with the necessary support files:

the_root_dir/
|-- README.md
|-- setup.py
|-- your_package
|--|-- __init__.py
|--|-- file_0.py
|--|-- file_1.py
|--|-- file_2.py

At the root directory, at the same level as the your_package, create the README.md file. This file will cover things that you want the users to know about your package, such as installation and use instructions.
Another file at the same level is the setup.py file. This file will contain the information necessary to set up the package for publishing. We’ll talk about what’s inside this file in a minute.
Within the your_package directory, create the __init__.py file. This file has to be named exactly that, and what it does is turn a regular directory into a Python package. For the current simple package, we can just import the function from file_1.py there, so that it is exported at the package level:

from your_package.file_1 import hello_world
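
To illustrate what this re-export does, here is a small self-contained sketch. It builds a throwaway copy of the package in a temporary directory (assuming hello_world lives in file_1.py, as described earlier), imports it, and checks where the function really comes from. The temporary directory is only for the demonstration; in a real project the files sit in your repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    package_dir = Path(root) / "your_package"
    package_dir.mkdir()
    (package_dir / "file_1.py").write_text(
        'def hello_world():\n    print("Hello World!")\n'
    )
    # The re-export: makes hello_world available as your_package.hello_world.
    (package_dir / "__init__.py").write_text(
        "from your_package.file_1 import hello_world\n"
    )

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        your_package = importlib.import_module("your_package")
        your_package.hello_world()
        exported_from = your_package.hello_world.__module__
    finally:
        sys.path.remove(root)

print(exported_from)  # your_package.file_1
```

Users of the package can then write `from your_package import hello_world` without knowing which file defines it.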

The README file

Theoretically, you can use just plain text for this file, but I prefer to use a markdown file that gives you formatting options to make it more interesting to read.
Please note that the gist is saved as a TXT file, which shows you the source text for the markdown before bitbucket automatically applies the formatting for markdown files.
Usually, it’s recommended that you include the license information about the package so that the users can know the terms.

The setup file

As mentioned previously, we use the 'setup.py' file to indicate how our package should be prepared. The following code shows you what a typical 'setup.py' file looks like:

import setuptools

with open("README.md") as file:
    read_me_description = file.read()

setuptools.setup(
    name="your-package-username",
    version="0.1",
    author="Your Name",
    author_email="your_email",
    description="This is a test package.",
    long_description=read_me_description,
    long_description_content_type="text/markdown",
    url="package_bitbucket_page",
    packages=['your_package'],
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.7',
)

We first import the setuptools module, which has convenient methods for packaging setup.
In the setup() function call, most parameters are very straightforward and should be self-explanatory. Some highlights are shown below:

name: The distribution name of the package, which has to be unique on pypi.org. To ensure uniqueness, it’s always a good idea to append your PyPI username to your package name if you also want to upload it to PyPI.

long_description: The long description is usually set from the README.md file. To show the description properly, you specify that it is a markdown file by setting the long_description_content_type parameter.

packages: The list of packages that you want to publish in the distribution package. Because we’re publishing just one package (your_package), its name is shown in a list. However, if you want to publish all packages in the directory, you can use setuptools.find_packages() to retrieve them conveniently.

python_requires: Specifies the version of Python that your package requires.
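
As a quick illustration of the packages note above, the following sketch shows what setuptools.find_packages() returns for a small directory tree. The tree is created in a temporary directory only for the demonstration, and the helpers sub-package and docs directory are made-up names.

```python
import tempfile
from pathlib import Path

import setuptools

with tempfile.TemporaryDirectory() as root:
    # Two packages: each has an __init__.py file.
    for name in ("your_package", "your_package/helpers"):
        package_dir = Path(root) / name
        package_dir.mkdir(parents=True)
        (package_dir / "__init__.py").write_text("")
    # A plain directory without __init__.py is not reported as a package.
    (Path(root) / "docs").mkdir()

    found = sorted(setuptools.find_packages(where=root))

print(found)  # ['your_package', 'your_package.helpers']
```

In setup.py you would then pass `packages=setuptools.find_packages()` instead of listing the names by hand.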

When you install it from another codebase, pip will use the version you specified in the setup.py file.

Now I am assuming you have committed and pushed this code into Bitbucket. Now it is time to use it from another codebase.

From the root directory of the project where you want to use this library, run:
pip install git+https://user_name@bitbucket.org/company_name/your_project.git#egg=name_in_setup_file_you_wrote

Or, if you want to download a specific tagged version:
pip install git+https://user_name@bitbucket.org/company_name/your_project.git@1.0.7#egg=name_in_setup_file_you_wrote

Without a tag, pip downloads the latest commit on the default branch and reports the version written in the setup.py file.

If you want to download a specific version tagged with 'git tag', tag that commit using git and also update the version in the setup.py file, because pip does not derive the package version from the git tag. pip downloads the commit you have tagged and treats the version written in setup.py as the library version.
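
After installing, you can check which version pip recorded by using the standard library (Python 3.8+). This is a small sketch; the helper name `installed_version` is made up, and "your-package-username" is the placeholder distribution name from the setup.py above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints the version written in setup.py once the package is installed,
# or None if it has not been installed in this environment.
print(installed_version("your-package-username"))
```

If the value printed here does not match the git tag you installed from, the version in setup.py was not updated for that tag.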

Conclusion

In this article, we learned the major steps that we use to create a Python package for distribution through Bitbucket.