
IClassifier

class IClassifier(Interface):
    """Provides a method to classify documents by analyzing
    their content."""

    def getClassificationItemList(file, sample_url):
        """Returns a list of (key, value) pairs representing the
        classification of the document. May raise an exception
        if the learner is not ready or requires too much time
        to initialize.

        file -- the document whose content is analyzed

        sample_url -- a URL to a text input which defines
                      the input sample to learn from, possibly
                      in the form of a URL to a web service
        """


Notes:

The IClassifier interface classifies text content, image content, etc. by returning a list of key/value pairs. This list defines implicit metadata derived by analyzing the file content.
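To make the shape of the return value concrete, here is a minimal sketch of a classifier that follows the method signature above. The class name KeywordClassifier, the keyword table, and the "word_count" and "subject" keys are all hypothetical; the interface itself does not prescribe which keys are returned. The sketch is plain Python and treats sample_url as optional, as described in these notes.

```python
import io


class KeywordClassifier:
    """Toy classifier (hypothetical, for illustration only)."""

    # Hypothetical keyword table: subject -> words that hint at it.
    KEYWORDS = {
        "engineering": {"server", "python", "deploy"},
        "finance": {"invoice", "payment", "tax"},
    }

    def getClassificationItemList(self, file, sample_url=None):
        # `file` is assumed to be a file-like object opened in text mode.
        words = file.read().lower().split()
        item_list = [("word_count", len(words))]
        # Add one ("subject", name) pair per subject whose keywords appear.
        for subject, hints in sorted(self.KEYWORDS.items()):
            if hints.intersection(words):
                item_list.append(("subject", subject))
        return item_list


document = io.StringIO("Invoice for the payment due")
result = KeywordClassifier().getClassificationItemList(document)
```

A list of pairs, rather than a dictionary, lets the same key appear more than once, for example when a document matches several subjects.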

Optionally, a URL to sample data, which defines a learning sample for machine learning software, may be provided. This URL points to a file which can be downloaded. The file contains either the sample data itself or the connection information needed to connect to a Web Service and retrieve the sample data.

It is up to the classifier to implement persistence so that sample data only needs to be downloaded from time to time. In between downloads, the learning model should remain unchanged and persist in RAM or on the filesystem.
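One possible way to implement this persistence is sketched below, under several assumptions: the helper class CachedModelStore, its parameters, and the one-hour refresh interval are hypothetical, the "training" step is a placeholder that only collects distinct words, and the download is injected as a function (fetch_sample) so the sketch runs without network access.

```python
import os
import pickle
import tempfile
import time


class CachedModelStore:
    """Keeps a learning model in RAM and on disk (hypothetical helper)."""

    def __init__(self, cache_path, fetch_sample, max_age_seconds=3600.0):
        self.cache_path = cache_path          # where the model is pickled
        self.fetch_sample = fetch_sample      # callable: url -> sample text
        self.max_age_seconds = max_age_seconds
        self._model = None
        self._loaded_at = 0.0

    def _train(self, sample_text):
        # Placeholder for real learning: the set of distinct words.
        return set(sample_text.lower().split())

    def get_model(self, sample_url):
        now = time.time()
        # 1. Reuse the in-RAM model while it is fresh enough.
        if self._model is not None and now - self._loaded_at < self.max_age_seconds:
            return self._model
        # 2. Otherwise reuse the on-disk model if it is fresh enough.
        if os.path.exists(self.cache_path):
            modified_at = os.path.getmtime(self.cache_path)
            if now - modified_at < self.max_age_seconds:
                with open(self.cache_path, "rb") as cache_file:
                    self._model = pickle.load(cache_file)
                self._loaded_at = modified_at
                return self._model
        # 3. Otherwise download the sample again, retrain and persist.
        self._model = self._train(self.fetch_sample(sample_url))
        with open(self.cache_path, "wb") as cache_file:
            pickle.dump(self._model, cache_file)
        self._loaded_at = now
        return self._model


# Usage: a fake download function records how often it is called.
calls = []

def fake_fetch(url):
    calls.append(url)
    return "spam ham spam"

cache_path = os.path.join(tempfile.mkdtemp(), "model.pickle")
url = "http://example.invalid/sample.txt"
store = CachedModelStore(cache_path, fake_fetch)
first = store.get_model(url)       # downloads and trains
second = store.get_model(url)      # served from RAM
restarted = CachedModelStore(cache_path, fake_fetch)
third = restarted.get_model(url)   # served from the filesystem
```

In this sketch the sample is fetched only once: the second call is answered from RAM, and a new store object (standing in for a restarted process) reloads the model from the filesystem instead of downloading it again.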