Dictionary tree search
From GiderosMobile
https://forum.gideros.rocks/discussion/821/compressed-ternary-dags-for-dictionary-search
Compressed ternary DAGs for dictionary search
atilim, one of Gideros founder, implemented an optimized version of the Ternary Search Trees algorithm for Gideros!
- It's fast (Samsung SII, the search speed is about 30000-50000 searches per second)
- It's compact and GC friendly
- After compression the dictionary file is usually smaller than the original file
Creating Dictionaries
You can create your dictionaries by using the code buildtst.cpp or use buildtst.exe with the cmd line. You can download the zip file:
Media:Dictionary_tree_search_cpp.zip
Steps to convert a .txt file to a compressed .dic file suitable for fast search:
- go in the folder where "buildtst.exe" is
- put your .txt file in that same folder
- in windows explorer navbar type cmd
- then in the cmd editor enter:
buildtst.exe yourfile.txt yourfile.dic
That's it, your text file has been converted to a dic file.
Gideros Search Algorithm
-- bg
application:setBackgroundColor(0xaa5500)
local f = io.open("sowpods.dic", "rb")
local dic = f:read("*all")
f:close()
local function search(dic, str)
local pos = 1
local node = 1
while node ~= 0 do
local char = str:byte(pos) or 0
local index = (node - 1) * 10
local splitchar = dic:byte(index + 1)
if char < splitchar then
local b1, b2, b3 = dic:byte(index + 2, index + 4)
node = b1 + b2 * 256 + b3 * 65536
elseif char == splitchar then
if char == 0 then
return true
end
pos = pos + 1
local b1, b2, b3 = dic:byte(index + 5, index + 7)
node = b1 + b2 * 256 + b3 * 65536
else
local b1, b2, b3 = dic:byte(index + 8, index + 10)
node = b1 + b2 * 256 + b3 * 65536
end
end
return false
end
print(search(dic, "ZYMOSIMETER")) --> should be true
print(search(dic, "GIDEROS")) --> should be false
print(search(dic, "LANGUAGE")) --> should be true
print(search(dic, "SYSTEM")) --> should be true