Veeranki Phani Sirisha

Posted on Dec 31, 2024

Data Structures and Algorithms on Strings

#datastructures #algorithms #suffixtrees

Day 1: String Algorithms Overview

The first day of the ACM Winter School on DSA Strings began with an insightful lecture by Dr. Venkatesh Raman on essential string processing algorithms, including:

Asymptotic Analysis: Importance of Big-O notation and time/space complexity.
KMP Algorithm: Efficient pattern matching using the prefix function.
Boyer-Moore Algorithm: Focused on the bad character and good suffix rules.
Pattern Matching Techniques: Efficient substring search.
LCS & Edit Distance: Dynamic programming solutions for LCS and applications of edit distance in fields like bioinformatics.

Sriram Bhyravarapu led a hands-on tutorial where participants implemented the KMP and Boyer-Moore algorithms, practiced pattern matching, and explored dynamic programming with LCS and edit distance.

Day 2: Advanced String Algorithms

The second day focused on advanced topics such as wildcard matching, mismatches, and suffix trees:

Wildcard Matching: Algorithms for handling * and ? characters in patterns.
Pattern Matching with Mismatches: Techniques for approximate pattern matching.
Suffix Trees: Construction and applications in substring search, longest repeated substring, and lexicographical analysis.

Sriram Bhyravarapu guided the hands-on implementation of wildcard matching, approximate pattern matching, and suffix trees, giving participants practical experience with these
advanced techniques.

Day 3: Suffix Arrays and Their Applications

Dr. Sharma Thankachan's lecture focused on the fundamentals of suffix arrays, including their construction (Manber-Myers and Kasai's LCP algorithm) and applications in pattern matching, data compression, and bioinformatics.

Sriram Bhyravarapu guided participants through hands-on implementation of suffix arrays, LCP arrays, and applications like substring searches, longest repeated substring identification, and Burrows-Wheeler Transform (BWT) for compression tasks.

Day 4: Advanced String Data Structures and Queries

Dr. Thankachan discussed advanced applications of suffix trees and arrays in text indexing, bioinformatics, and data compression. He also introduced K-th lexicographical substring queries and Range Minimum Queries (RMQ).

Sriram led coding exercises on K-th queries, RMQs, and their integration with suffix structures. Participants worked on real-world problems involving substring ranking, pattern matching, and optimization.

The combination of theory and practical coding exercises provided participants with a solid understanding of advanced string algorithms for solving complex real-world problems.

Day 5: Rank-Select Operations, Wavelet Trees, and Applications

Lecture by Dr. Chirag Jain:
Introduced rank (count of occurrences) and select (position of K-th occurrence) operations.
Explained wavelet trees for efficient sequence representation and supporting queries.
Discussed real-world applications in text compression, pattern matching, and bioinformatics.

Tutorial by Sriram Bhyravarapu:
Hands-on implementation of rank-select operations and wavelet tree construction.
Solved practical problems like substring frequency and range queries.

Key Takeaways:
Gained practical skills in rank-select operations and wavelet trees for computational problems.

Day 6: Burrows-Wheeler Transform (BWT), FM-Index, and Applications

Lecture by Dr. Chirag Jain:
Introduced BWT for text compression and FM-Index for efficient substring matching.
Discussed applications in text indexing and genomic data analysis.

Tutorial by Sriram Bhyravarapu:
Implemented BWT and FM-Index, with exercises in pattern matching and text compression.

Key Takeaways:
Mastered BWT and FM-Index, essential for text processing and genomic analysis.

Day 7: Locality-Sensitive Hashing (LSH), Bloom Filters, and Applications

Lecture by Dr. Naveen Sivadasan:
Introduced LSH for approximate nearest neighbor searches and Bloom Filters for efficient set membership testing.
Discussed applications in image similarity, web indexing, and network security.

Tutorial by Sriram Bhyravarapu:
Implemented LSH and Bloom Filters, with exercises on query optimization and false positive rates.

Key Takeaways:
Learned advanced techniques for efficient data retrieval and set operations using LSH and Bloom Filters.

Day 8: Applications and Closing Ceremony

Session by Dr. Naveen Sivadasan:
Explored real-world applications of LSH and Bloom Filters in data deduplication, genomic data analysis, and recommendation systems.
Highlighted their role in bridging theoretical knowledge with practical industry challenges.

Closing Ceremony:
Reflected on the week-long program, emphasizing the advanced concepts, hands-on skills, and real-world applications covered.
Celebrated participants’ achievements and growth in tackling complex computational problems.

Key Takeaway:
The final day reinforced the practical relevance of advanced algorithms and marked a fitting conclusion to an intensive and enriching learning journey.

DEV Community

Data Structures and Algorithms on Strings

Top comments (0)

Read next

Item 81: Prefira os utilitários de concorrência ao wait e notify

How I Built My First Portfolio Website as a Beginner

Gerando Arquivos Excel(XLSX) a Partir do Banco de Dados com Node.js

A Guide to Starting Your Public Speaking Journey