I've started tackling coding problems to try and improve my skills.
I'm working on the 'Alien Dictionary' coding problem which, when given a sorted list of 'Alien Words' you need to determine the 'Alien Alphabet'. The alien alphabet is made up of Latin characters but in a different order than ours.
I've since learned there are more optimised ways of solving this which I will look into, but I want to see my instinctual approach through to completion.
My code does compile with c++20 and outputs the correct alphabet, however, I had to implement a 'hack' to cover an edge case which I explain in a code comment.
I can't quite wrap my head around why I needed the hack, or how to fix my code to not require it.
#include <iostream> // std::cout
#include <map> // std::map
#include <vector> // std::vector
#include <algorithm> // std::sort
/*
Input: words[] = ["pbb", "bpku", "bpkb", "kbp", "kbu"];
Expected Output: ['p', 'u', 'b', 'k'];
*/
typedef std::vector<std::string> WordList;
WordList alienWords = {"pbb", "bpku", "bpkb", "kbp", "kbu"};
typedef std::map<std::pair<char, char>, char> RuleBook;
RuleBook alienRuleBook;
typedef std::vector<char> Alphabet;
Alphabet alienAlphabet;
void populateAlphabet(Alphabet& alphabet, WordList& wordList) {
alphabet.clear();
for (int word = 0; word < wordList.size(); word++) {
for (int letter = 0; letter < wordList[word].size(); letter++) {
if(std::find(alphabet.begin(), alphabet.end(), wordList[word][letter]) == alphabet.end()) {
alphabet.push_back(wordList[word][letter]);
}
}
}
}
void generateRules(RuleBook& ruleBook, WordList& wordList){
for (int firstWord = 0; firstWord < wordList.size(); firstWord++) {
for (int secondWord = firstWord + 1; secondWord < wordList.size(); secondWord++) {
if (secondWord == wordList.size()) break;
int letter = 0;
for (; letter < wordList[firstWord].size(); letter++) {
if (wordList[firstWord][letter] == wordList[secondWord][letter]) continue;
ruleBook[{wordList[firstWord][letter], wordList[secondWord][letter]}] = '<';
ruleBook[{wordList[secondWord][letter], wordList[firstWord][letter]}] = '>';
break;
}
}
}
}
// needs to return TRUE if 'l' should come before 'r'.
bool getRule(char l, char r) {
switch(alienRuleBook[{l, r}]) {
case '>': return false;
case '<': return true;
}
std::cout << "ERROR! No rule found for: '" << l << "' vs '" << r << "'\n\n";
// The below is a hack because I don't understand to fix the case of {'u', 'k'}
// There's no 'discovered' rule saying 'u' comes before 'k' or 'k' comes after 'u'
// even though we KNOW 'u' comes before 'b' and we know that 'b' comes before 'k'.
return true;
}
void printAlphabet(Alphabet& alphabet){
std::cout << "=== Alphabet ===" << "\n ";
for(const auto it : alphabet)
std::cout << it << " ";
std::cout << "\n================\n\n";
}
void printRuleBook(RuleBook& ruleBook){
std::cout << "=== Discovered Rules ===" << "\n";
for(const auto it : ruleBook)
std::cout << " " << it.first.first << " " << it.second << " " << it.first.second << '\n';
std::cout << "================\n\n";
}
int main() {
populateAlphabet(alienAlphabet, alienWords);
generateRules(alienRuleBook, alienWords);
std::sort(alienAlphabet.begin(), alienAlphabet.end(), getRule);
printRuleBook(alienRuleBook);
printAlphabet(alienAlphabet);
return 0;
}
In order to implement the
getRulefunction if there is no implicit rule for{a, b}, you should search for{a, x} = '>'where{x, b} = '>'or{a, x} = '<'where{x, b} = '<'In case if you cannot find
{a, x} and {x, b}, you should search for{a, x} + {x, y} + {y, b}. The search depth can increase. I'm not sure that is good solution.I would suggest to look at the problem as a directed graph:
p->b->kandp->u->band (duplicated)p->up). That will be the first char of the alphabetp. That will give you the second char (u).uand find the node that has no other incoming connections exceptpandu. That will give youb