FAQ

Frequently Asked Questions

General

Q: What is LipiTk and who is it meant for ?

A: Lipi Toolkit is a collection of algorithms and resources for online Handwriting Recognition (HWR). The adjective "online" here implies that the handwriting is captured and represented as a stream of (x,y) points using a digitizer ... not as a raster image (which is referred to as "offline").

The toolkit has different components intended to support different types of users, with different requirements, such as application developers, technology enthusiasts, and researchers. See the About page for details.

Q: What is in the toolkit ?

A: The Lipi Toolkit website offers several kinds of components and resources, all related to digital ink and handwriting recognition. Most (but not all) include source code along with binaries. The major categories are listed below.

- Lipi Toolkit, which in turn contains:
  - Core Toolkit, a collection of algorithms and scripts for creating your own handwritten shape recognizers.
  - Lipi Designer, a Java-based application which provides a GUI for creating shape recognizers interactively.
  - Alphanumeric Character Recognizer, a shape recognizer for English uppercase characters, lowercase characters and numerals that can be readily integrated into client applications.
- Standalone tools for online handwriting data collection.
- Shape recognizers built using the Lipi Toolkit for other scripts and character sets.

Q: What can you use LipiTk for ?

A: LipiTk lets you integrate handwriting recognition of isolated characters or shapes into your applications. It does this through several components targeted at different types of users, for example:

- Ready-to-use recognizers for common character sets that you can integrate into your own applications. For example, demonumerals and the Alphanumeric Character Recognizer are part of the Lipi Toolkit download, and additional Lipi Recognizers are available for download.
- Lipi Designer, which allows you to create a new recognizer for a custom set of shapes with just a few samples (it combines the steps of data collection and training behind the scenes). You can then integrate these recognizers into your own applications.
- Data collection tools, for TabletPC and other devices, for collecting large numbers of samples of a set of characters/shapes. You can then use the data to train one of the shape recognition methods provided in the Core Toolkit.
- Implementations of shape features and shape recognition methods in the Core Toolkit, which you can tweak or replace with your own algorithms.

Note that "online handwriting input" need not be from a stylus per se; as long as you have a stream of (x,y) coordinates with stroke begin and end events, you can use Lipi Toolkit for recognition. For instance, you can use it for recognizing gestures made with a finger on a touch surface, or by a hand in front of a depth sensor.
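To make the idea of "a stream of (x,y) coordinates with stroke begin and end events" concrete, here is a minimal Python sketch that groups such a stream into strokes. It is only an illustration and not part of Lipi Toolkit: the event names "down", "move" and "up" and the function events_to_strokes are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def events_to_strokes(events: Iterable[Tuple[str, float, float]]) -> List[Stroke]:
    """Group (kind, x, y) events into strokes.

    kind is "down" (stroke begins), "move" (point within a stroke)
    or "up" (stroke ends). These names are hypothetical, chosen for
    this sketch; they are not Lipi Toolkit identifiers.
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for kind, x, y in events:
        if kind == "down":
            # Start a new stroke at the pen-down position.
            current = [(x, y)]
        elif kind == "move" and current:
            # Points seen before any "down" event are ignored.
            current.append((x, y))
        elif kind == "up" and current:
            # Close the stroke at the pen-up position.
            current.append((x, y))
            strokes.append(current)
            current = []
    return strokes
```

The same grouping works whether the events come from a stylus, a touch surface or a depth sensor, as long as the source can tell you where each stroke begins and ends.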

Q: Can I use LipiTk components for my research or commercial application ? Do I have to return my source code changes ?

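A: Lipi Toolkit components (source code as well as binaries) are licensed under the MIT license, which places no restrictions on the type of use, so both research and commercial use are permitted. The MIT license does not require you to return your source code changes; it only asks that the copyright and permission notice be retained in copies of the software.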

Q: It appears that LipiTk only supports the recognition of isolated characters and shapes. What about words and sentences ?

A: This is correct. Lipi Toolkit is mainly meant for isolated characters, gestures and symbols.

It is possible to use LipiTk's isolated character recognition together with an exhaustive search or Dynamic Programming for recognizing "discrete" writing (where there is a pen up between characters, and the temporal order of symbols is for the most part constant).
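As a rough illustration of the Dynamic Programming idea for discrete writing, the Python sketch below splits a sequence of strokes into characters so that the product of per-character confidences is maximized. It is a sketch under stated assumptions, not Lipi Toolkit code: recognize is a hypothetical stand-in for any isolated character recognizer that returns a (label, confidence) pair for a group of consecutive strokes, and max_strokes_per_char is an assumed limit on how many strokes one character may contain.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def segment_word(
    strokes: Sequence,
    recognize: Callable[[Sequence], Tuple[str, float]],
    max_strokes_per_char: int = 3,
) -> List[str]:
    """Return the best sequence of character labels for the strokes.

    recognize(group) is a hypothetical callback returning
    (label, confidence) with confidence in (0, 1].
    """
    n = len(strokes)
    # best[i]: best total log-confidence for the first i strokes.
    best = [-math.inf] * (n + 1)
    # back[i]: (start index, label) of the last character ending at i.
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_strokes_per_char), end):
            if best[start] == -math.inf:
                continue  # no valid segmentation reaches `start`
            label, confidence = recognize(strokes[start:end])
            if confidence <= 0.0:
                continue  # treat as "not a character"
            score = best[start] + math.log(confidence)
            if score > best[end]:
                best[end] = score
                back[end] = (start, label)
    if n > 0 and back[n] is None:
        return []  # no segmentation covers all strokes
    # Walk the back-pointers from the end to recover the labels.
    labels: List[str] = []
    i = n
    while i > 0:
        start, label = back[i]
        labels.append(label)
        i = start
    return labels[::-1]
```

This relies on the two properties mentioned above: characters are separated by pen-up events, so every character is a run of whole strokes, and the strokes arrive in writing order, so only consecutive groupings need to be considered.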

Cursive writing is a different ball game. Several approaches ranging from Hidden Markov Models to graph matching have been tried for different types of scripts (Latin, Oriental, etc.), typically with a fixed "dictionary" of words. Some researchers have used Lipi Toolkit for preprocessing and feature extraction at the stroke or character level as part of their word recognition technique.

The recognition of higher level units such as phrases and sentences typically involves the use of language models in the form of word n-grams.
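To show how a word n-gram model can help at the phrase level, here is a small Python sketch that combines per-word recognizer confidences with bigram probabilities and picks the most likely phrase. Everything in it is an assumption made for illustration: the function best_phrase, the "<s>" start marker, the floor probability for unseen word pairs, and the exhaustive search (a real system would more likely use a Viterbi or beam search). None of it is provided by Lipi Toolkit.

```python
import math
from itertools import product
from typing import Dict, List, Sequence, Tuple


def best_phrase(
    candidates: Sequence[Sequence[Tuple[str, float]]],
    bigram_prob: Dict[Tuple[str, str], float],
    floor: float = 1e-6,
) -> List[str]:
    """Pick the phrase with the highest combined score.

    candidates[i] lists (word, confidence) alternatives for word
    position i, with confidence in (0, 1]. bigram_prob maps
    (previous word, word) to a probability; unseen pairs get `floor`.
    """
    best_words: List[str] = []
    best_score = -math.inf
    # Exhaustive search over all combinations: fine for a sketch,
    # too slow for long phrases with many alternatives.
    for combo in product(*candidates):
        score = 0.0
        prev = "<s>"  # hypothetical start-of-phrase marker
        for word, confidence in combo:
            score += math.log(confidence)
            score += math.log(bigram_prob.get((prev, word), floor))
            prev = word
        if score > best_score:
            best_words = [word for word, _ in combo]
            best_score = score
    return best_words
```

The language model can overrule the recognizer: a word with a slightly lower recognition confidence wins if it is much more likely to follow the previous word.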

If you are a researcher working on these problems and would like to contribute to Lipi Toolkit, do let us know.

Q: How do I know what is in the pipeline ? How can I contribute my feedback and suggestions ?

A: Please use the forums on SourceForge for your feedback and suggestions. There is also a feedback survey available from the home page at this time.

Q: Does LipiTk represent the state of the art in recognition technology ?

A: LipiTk uses well-known and simple algorithms such as Nearest Neighbor classification using Dynamic Time Warping, and will give you good results as long as it has been trained on sufficient data and configured correctly. However, there are many advanced algorithms in the literature that may give better results. The purpose of Lipi Toolkit is to provide recognition technology primarily for character sets (and platforms) where commercial technology does not exist or is too expensive.
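For readers unfamiliar with the technique, the following Python sketch shows the textbook form of Nearest Neighbor classification with Dynamic Time Warping (DTW) on sequences of (x,y) points. It is not the Lipi Toolkit implementation, which also involves preprocessing and configuration not shown here; the function names dtw_distance and nearest_neighbor are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dtw_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Classic DTW distance between two point sequences.

    Uses Euclidean distance between points and allows each point
    to be matched to one or more points of the other sequence.
    """
    n, m = len(a), len(b)
    # cost[i][j]: cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b point
                cost[i][j - 1],      # b[j-1] also matched an earlier a point
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def nearest_neighbor(
    sample: Sequence[Point],
    prototypes: Sequence[Tuple[str, List[Point]]],
) -> str:
    """Return the label of the prototype closest to the sample."""
    label, _ = min(prototypes, key=lambda p: dtw_distance(sample, p[1]))
    return label
```

Because the classifier simply compares the input against stored samples, its accuracy depends directly on how many representative training samples it has, which is why sufficient training data matters so much.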

If you find the recognition accuracy from the available shape recognition methods unsatisfactory for your application, you are of course free to try your own algorithms with the toolkit, or use just the components you need.

Q: What kind of support can I expect ?

A: LipiTk is currently maintained by a small group of volunteers at HP, and as such, we do not have the means to support users. We do try to make detailed documentation and release notes available, and then there are the forums. You can also contact us via email; we will do what we can !

Q: How can I get involved ?

A: For starters, you can use different components of the toolkit and provide us with feedback. We can use help with reviewing the documentation for errors and creating user-friendly tutorials.

If you are creating recognizers for your language/script, we would love to include them as part of the ready-to-use Lipi Recognizers.

You are also welcome to submit your own (sufficiently well-tested !) algorithms and demos.

We would also like pointers to applications you have created using LipiTk components, and other resources (such as datasets) which may be useful to the community.

Accepted contributions will be acknowledged on the About page.

Using the Toolkit

Q: I am looking for a recognizer for common character sets such as numerals, English uppercase letters, etc. What component should I use ?

A: The Alphanumeric Character Recognizer is already included with the Lipi Toolkit download. For other character sets, look through the downloadable Lipi Recognizers - perhaps there is one you can use. If not, and if you have access to a dataset of handwriting samples for that character set, you can train one of the shape recognition methods from the Core Toolkit using the dataset and create your own recognizer.

Q: I want to create a recognizer for a set of shapes that I have defined. How do I go about this without having to get too deep into the code ?

A: If you want to quickly build a recognizer for a custom set of shapes, try the Lipi Designer tool. This allows you to provide a few samples of each shape and builds a recognizer for you (using the DTW Shape Recognition Method internally). This is especially convenient when you are actively adding new shapes or modifying old ones, as in a gesture-based application. You will still need to integrate the recognizer into your application though - there is sample code available that illustrates how to do that.

If you have a fixed set of shapes (such as a character set for a script) and are looking for high accuracy, you may get better results by formally collecting data from a large number (say 100) of users using one of the Data Collection Tools, and then training one of the Shape Recognition methods in the toolkit.

Q: I want to experiment with a different set of preprocessing methods/features/classification algorithms. How can I use the toolkit ?

A: If you want to try a different set of preprocessing algorithms or features, you can integrate them into the toolkit and then train and test with the available recognition algorithms. If you want to try a different recognition technique or toolbox, you can either implement that technique for Lipi Toolkit, or use Lipi Toolkit only to extract features to files, and do the recognition using your own implementation or toolbox.