LeetCode 28. Find the Index of the First Occurrence in a String

Post by ailswan Aug. 09, 2024

28. Find the Index of the First Occurrence in a String

Problem Statement

link: LeetCode.cn LeetCode Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example:

Input: haystack = "sadbutsad", needle = "sad" Output: 0

Input: haystack = "leetcode", needle = "leeto" Output: -1

Constraints: 1 <= haystack.length, needle.length <= 104 haystack and needle consist of only lowercase English characters.

Solution Approach

Brute Force Approach: Iterate through each possible starting position in the haystack and compare substrings of length equal to needle to find a match.

KMP Algorithm (Solution 1): Utilize the Knuth-Morris-Pratt (KMP) algorithm’s partial match table to efficiently find the first occurrence of needle in haystack by skipping redundant comparisons.

KMP Algorithm (Solution 2): Employ an optimized KMP algorithm that uses the partial match table to avoid unnecessary re-evaluations, ensuring linear time complexity for matching needle in haystack.

Algorithm

  1. Brute Force Approach Iterate through Start Positions: Start at each index in the haystack and check if a substring of length equal to needle matches needle.

Substring Comparison: For each start position, compare characters of haystack with needle to determine if they are equal.

Return Index or -1: If a match is found, return the starting index of the substring; otherwise, continue searching or return -1 if no match is found.

  1. KMP Algorithm (Solution 1) Compute Partial Match Table: Preprocess the needle to create a table (next array) that holds the length of the longest prefix which is also a suffix for each position in needle.

Pattern Matching with Skips: Use the partial match table to skip over previously matched characters in needle when a mismatch occurs, thereby reducing redundant comparisons.

Return Index or -1: Traverse haystack while using the partial match table to efficiently find and return the starting index of needle or -1 if needle is not found.

  1. KMP Algorithm (Solution 2) Compute Partial Match Table: Create the next array for needle, which helps track the longest proper prefix that is also a suffix for efficient pattern matching.

Simultaneous Traversal: Traverse both haystack and needle simultaneously, using the next array to skip characters in needle when mismatches occur and avoid unnecessary comparisons.

Return Index or -1: Determine if a full match is found by comparing the lengths of haystack and needle, returning the starting index of the first occurrence or -1 if no match is found.

Implement

    class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        for i in range(len(haystack) - len(needle) + 1):
            if haystack[i] == needle[0]:
                j = 1
                while j < len(needle):
                    if haystack[i + j] != needle[j]:
                        break
                    j += 1

                if j == len(needle):
                    return i
        return -1
    
#kmp solution1 class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        a=len(needle)
        b=len(haystack)
        if a==0:
            return 0
        next=self.getnext(a,needle)
        p=-1
        for j in range(b):
            while p>=0 and needle[p+1]!=haystack[j]:
                p=next[p]
            if needle[p+1]==haystack[j]:
                p+=1
            if p==a-1:
                return j-a+1
        return -1

    def getnext(self,a,needle):
        next=['' for i in range(a)]
        k=-1
        next[0]=k
        for i in range(1,len(needle)):
            while (k>-1 and needle[k+1]!=needle[i]):
                k=next[k]
            if needle[k+1]==needle[i]:
                k+=1
            next[i]=k
        return next

#kmp solution2 class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        a=len(needle)
        b=len(haystack)
        if a==0:
            return 0
        i=j=0
        next=self.getnext(a,needle)
        while(i<b and j<a):
            if j==-1 or needle[j]==haystack[i]:
                i+=1
                j+=1
            else:
                j=next[j]
        if j==a:
            return i-j
        else:
            return -1

    def getnext(self,a,needle):
        next=['' for i in range(a)]
        j,k=0,-1
        next[0]=k
        while(j<a-1):
            if k==-1 or needle[k]==needle[j]:
                k+=1
                j+=1
                next[j]=k
            else:
                k=next[k]
        return next