Simple TRIE implementation

Posted on February 8th, 2011 by by Sameek

Its been a long time since my last post. But I finally have some free time at least till march begins. Here is a simple implementation of the TRIE Data structure. Its not the compressed TRIE, but the real simple one (using arrays instead of lists). I have done some very simple tests on it. I am more curious to know what is the memory charge for this implementation on a really large DB of words maybe 3,00,000 or so. My next target will be to implement a compressed version of it and see the difference.

Meanwhile feel free to use this copy.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
/*
 * Trie.h
 *
 *  Created on: Feb 8, 2011
 *      Author: sameekbanerjee
 */
 
#ifndef TRIE_H_
#define TRIE_H_
 
struct trienode
{
	char value;
	trienode* children[26];
	bool end;
	trienode();
	trienode(char value_);
	virtual ~trienode();
};
/*
 * Really inefficient Implementation of Trie
 * But easy to use and understand
 */
class Trie
{
public:
	Trie();
	virtual ~Trie();
	void addword(const char* word);
	bool search(const char* word);
private:
	bool validate(const char* word);
	trienode root;
};
 
#endif /* TRIE_H_ */

The implementation follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
/*
 * Trie.cpp
 *
 *  Created on: Feb 8, 2011
 *      Author: sameekbanerjee
 */
 
#include "Trie.h"
#include <cstdlib>
#include <cstdio>
#include <cstring>
#include <cctype>
#include <cassert>
const unsigned int MAXLEN = 12;
/*
 * Implementation of trienode
 */
trienode::~trienode()
{
	for(int i=0; i<26;i++)
		delete children[i];
}
 
trienode::trienode():value(0),end(false)
{
	std::memset(children,0,sizeof(trienode*) * 26);
}
 
trienode::trienode(char value_):value(value_),end(false)
{
	std::memset(children,0,sizeof(trienode*) * 26);
}
 
/*
 * Implementation of TRIE
 */
Trie::Trie()
{
}
 
Trie::~Trie()
{
}
/*
 * Takes a word and adds it to the dictionary
 */
void Trie::addword(const char *word)
{
	if(!validate(word))
		return;
	const char *p=word;
	trienode *current=&root;
	while(*p)
	{
		char c=tolower(*p);
		unsigned int pos=c-'a';
		assert(pos<26);
		if(current->children[pos]==0)
		{
			current->children[pos]=new trienode(c);
		}
		current=current->children[pos];
		p++;
	}
	current->end=true;
}
/*
 * Take a word and searches in the dictionary
 * Returns true if it exists false otherwise
 */
bool Trie::search(const char *word)
{
	if(!validate(word))
		return false;
	bool flag=true;
	const char *p=word;
	trienode *current=&root;
	while(*p)
	{
		char c=tolower(*p);
		unsigned int pos=c-'a';
		assert (pos<26);
		if(current->children[pos]==0)
		{
			flag = false;
			break;
		}
		current=current->children[pos];
		p++;
	}
	if(flag)
	{
		if(current->end)
			return true;
	}
	return false;
}
 
bool Trie::validate(const char *word)
{
	if(strlen(word)>MAXLEN)
	{
		std::fprintf(stderr,"Maximum length of Word exceeds %d\n",MAXLEN);
		return false;
	}
	const char *p=word;
	while(*p)
	{
		if(!isalpha(*p))
		{
			std::fprintf(stderr,"Words should only contain alphabets\n");
			return false;
		}
		p++;
	}
	return true;
}

Or Download the project here. TRIE

A word of caution: This is only tested on GNU C++ compiler for mac. This should compile on all other GNU versions, but may require minor changes if you try to compile it on Visual studio.

Also I tried the new wp-syntax plug-in. It is supposed to be cool but I don’t see any syntax highlighting, only line numbers. It may be clashing with my theme. I have to dig in to it.

Tags:

Leave a Reply