Skip to Content

Longest substring with at most K distinct characters

Home | Coding Interviews | Searching and Sorting | Longest substring with at most K distinct characters

Given a string s and an integer k, return the length of the longest substring of s that contains at most k distinct characters.

class Solution:
    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
        cnt = Counter()
        n = len(s)
        ans = j = 0
        for i, c in enumerate(s):
            cnt[c] += 1
            while len(cnt) > k:
                cnt[s[j]] -= 1
                if cnt[s[j]] == 0:
                    cnt.pop(s[j])
                j += 1
            ans = max(ans, i - j + 1)
        return ans

##Another solution:
def longest_substring_k_distinct(s, k):
    char_count = {}
    max_length = 0
    left = 0

    for right in range(len(s)):
        char_count[s[right]] = char_count.get(s[right], 0) + 1

        while len(char_count) > k:
            char_count[s[left]] -= 1
            if char_count[s[left]] == 0:
                del char_count[s[left]]
            left += 1

        max_length = max(max_length, right - left + 1)

    return max_length

# Test the function
s = "abcba"
k = 2
result = longest_substring_k_distinct(s, k)
print(result)  # Output: 3
class Solution {
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        Map<Character, Integer> cnt = new HashMap<>();
        int n = s.length();
        int ans = 0, j = 0;
        for (int i = 0; i < n; ++i) {
            char c = s.charAt(i);
            cnt.put(c, cnt.getOrDefault(c, 0) + 1);
            while (cnt.size() > k) {
                char t = s.charAt(j);
                cnt.put(t, cnt.getOrDefault(t, 0) - 1);
                if (cnt.get(t) == 0) {
                    cnt.remove(t);
                }
                ++j;
            }
            ans = Math.max(ans, i - j + 1);
        }
        return ans;
    }
}

The Longest Substring with At Most K Distinct Characters problem can be efficiently solved using a sliding window approach. You maintain a window defined by two pointers, expanding the window by moving the right pointer to include new characters while keeping track of their frequencies in a hash map. When the window contains more than K distinct characters, you shrink it by moving the left pointer until the constraint is satisfied again. Throughout this process, you continuously update the maximum length of the window whenever it contains no more than K distinct characters.

This method ensures that each character is processed at most twice (once when added and once when removed), resulting in an O(n) time complexity. The key to solving this problem lies in managing the window size dynamically and efficiently updating character frequencies to ensure the window always adheres to the K distinct characters constraint. This technique is powerful and widely applicable in problems involving substrings and character frequencies.

Posted by grwgreg 2 months ago

Related Problems

Given an array of distinct integers candidates and a target integer target, return a list of all unique combinations of candidates where the chosen numbers sum to target. You may return the combinations in any order.

The same number may be chosen from candidates an unlimited number of times. Two combinations are unique if the frequency of at least one of the chosen numbers is different.

There are a total of numCourses courses you have to take, labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [ai, bi] indicates that you must take course bi first if you want to take course ai.

For example, the pair [0, 1], indicates that to take course 0 you have to first take course 1.

Return true if you can finish all courses. Otherwise, return false.

The n-queens puzzle is the problem of placing n queens on an n x n chessboard such that no two queens attack each other.

Given an integer n, return all distinct solutions to the n-queens puzzle. You may return the answer in any order.

Each solution contains a distinct board configuration of the n-queens' placement, where 'Q' and '.' both indicate a queen and an empty space, respectively.

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

The overall run time complexity should be O(log (m+n)).