Dictionary tree search

From GiderosMobile

https://forum.gideros.rocks/discussion/821/compressed-ternary-dags-for-dictionary-search

Compressed ternary DAGs for dictionary search

atilim, one of Gideros founder, implemented an optimized version of the Ternary Search Trees algorithm for Gideros!

  • It's fast (Samsung SII, the search speed is about 30000-50000 searches per second)
  • It's compact and GC friendly
  • After compression the dictionary file is usually smaller than the original file

Creating Dictionaries

You can create your dictionaries by using the code buildtst.cpp or use buildtst.exe with the cmd line. You can download the zip file:

Media:Dictionary_tree_search_cpp.zip

Steps to convert a .txt file to a compressed .dic file suitable for fast search:

  • go in the folder where "buildtst.exe" is
  • put your .txt file in that same folder
  • in windows explorer navbar type cmd
  • then in the cmd editor enter:
buildtst.exe yourfile.txt yourfile.dic

That's it, your text file has been converted to a dic file.

Gideros Search Algorithm

-- bg
application:setBackgroundColor(0xaa5500)

local f = io.open("sowpods.dic", "rb")
local dic = f:read("*all")
f:close()

local function search(dic, str)
	local pos = 1
	local node = 1

	while node ~= 0 do
		local char = str:byte(pos) or 0
		local index = (node - 1) * 10
		local splitchar = dic:byte(index + 1)
		
		if char < splitchar then
			local b1, b2, b3 = dic:byte(index + 2, index + 4)
			node = b1 + b2 * 256 + b3 * 65536
		elseif char == splitchar then
			if char == 0 then
				return true
			end
			pos = pos + 1
			local b1, b2, b3 = dic:byte(index + 5, index + 7)
			node = b1 + b2 * 256 + b3 * 65536
		else
			local b1, b2, b3 = dic:byte(index + 8, index + 10)
			node = b1 + b2 * 256 + b3 * 65536
		end
	end
	return false
end

print(search(dic, "ZYMOSIMETER"))	--> should be true
print(search(dic, "GIDEROS"))		--> should be false
print(search(dic, "LANGUAGE"))		--> should be true
print(search(dic, "SYSTEM"))		--> should be true


The Full Project

Media:Dictionary_tree_search_algo.zip