Optimal Classification/Rypka Method

From Wikibooks, open books for an open world

Rypka's Method

Rypka's[1] method[2] uses the theoretical and empirical separatory equations outlined below to perform optimal classification. The method finds the optimal order of the fewest attributes that, in combination, define a bounded class of elements.

Application of the method begins with construction of an attribute-valued system in truth table[3] or spreadsheet form. Elements are listed in the leftmost column, beginning in the second row. Characteristics[4] are listed in the first row, beginning in the second column, with the title of the attributes in the upper-left cell. Normally the data file is named after the element class. The values that connect each characteristic with each element are placed in the intersecting cells. Selecting characteristics that all elements share may be the most difficult part of creating a database that can use this method.
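The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_table`, the class title "shapes", and the element and characteristic names are hypothetical and do not come from the primary reference.

```python
def build_table(title, characteristics, values):
    """Return a row-major table: header row first, then one row per element.

    title           -- text for the upper-left cell
    characteristics -- column headings, starting in the second column
    values          -- dict mapping each element name to its list of values
    """
    table = [[title] + list(characteristics)]
    for element, row in values.items():
        # Every element must carry a value for every characteristic.
        if len(row) != len(characteristics):
            raise ValueError(f"{element} needs a value for every characteristic")
        table.append([element] + list(row))
    return table


# Hypothetical data, invented for this sketch.
table = build_table(
    "shapes",
    ["has_corners", "is_closed", "is_filled"],
    {"square": [1, 1, 0], "disc": [0, 1, 1], "arc": [0, 0, 0]},
)
for row in table:
    print(row)
```

The length check reflects the requirement that every characteristic be shared by all elements; a row with a missing value is rejected rather than padded.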

The elements are first sorted in descending order according to their truth table value, which is calculated from the existing sequence and value of characteristics for each element. Duplicate truth table values or multisets for the entire bounded class reveal either the need to eliminate duplicate elements or the need to include additional characteristics.
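As a rough sketch of this sorting step, the example below assumes that the truth table value of an element is obtained by reading its row of characteristic values as the digits of a number in the chosen radix, most significant characteristic first. That reading is an assumption made for illustration; the primary reference should be consulted for the exact definition. The names `truth_table_value`, `ranked`, and `duplicates`, and the sample data, are hypothetical.

```python
from collections import defaultdict


def truth_table_value(values, radix=2):
    """Read a row of characteristic values as digits, most significant first."""
    total = 0
    for v in values:
        if not 0 <= v < radix:
            raise ValueError(f"value {v} is not a valid digit in radix {radix}")
        total = total * radix + v
    return total


# Hypothetical elements with three binary characteristics each.
elements = {"e1": [1, 0, 1], "e2": [0, 1, 1], "e3": [1, 0, 1], "e4": [1, 1, 0]}

# Sort element names in descending order of truth table value.
ranked = sorted(elements, key=lambda name: truth_table_value(elements[name]),
                reverse=True)

# Group elements by value; any group larger than one is a duplicate.
groups = defaultdict(list)
for name, values in elements.items():
    groups[truth_table_value(values)].append(name)
duplicates = {value: names for value, names in groups.items() if len(names) > 1}

print(ranked)
print(duplicates)
```

Here `e1` and `e3` share a truth table value, so either one of them is redundant or another characteristic is needed to tell them apart.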

An empirical separatory value is calculated for each characteristic in the set, and the characteristic with the greatest empirical separatory value is exchanged with the characteristic that occupies the most significant attribute position.

Next, the second most significant characteristic is found by calculating an empirical separatory value for each remaining characteristic in combination with the first characteristic. The characteristic that produces the greatest empirical separatory value is then exchanged with the characteristic that occupies the second most significant attribute position.

Next, the third most significant characteristic is found by calculating an empirical separatory value for each remaining characteristic in combination with the first and second characteristics. The characteristic that produces the greatest empirical separatory value is then exchanged with the characteristic that occupies the third most significant attribute position. This procedure may continue until all characteristics have been processed or until one hundred percent separation of the elements has been achieved.
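The stage-by-stage selection described above can be sketched as a greedy loop. The sketch makes one loud assumption: it stands in for the empirical separatory value with a simple count of the element pairs that the chosen characteristics tell apart. The actual equation is given in the primary reference and may differ. The function names `separated_pairs` and `order_characteristics` and the sample rows are hypothetical, and the sketch returns the chosen column order rather than exchanging columns in place.

```python
from collections import Counter
from math import comb


def separated_pairs(rows, columns):
    """Count element pairs that differ on at least one of the given columns.

    Assumed stand-in for the empirical separatory value, not Rypka's equation.
    """
    groups = Counter(tuple(row[c] for c in columns) for row in rows)
    # All pairs, minus the pairs that fall in the same group.
    return comb(len(rows), 2) - sum(comb(size, 2) for size in groups.values())


def order_characteristics(rows):
    """Greedily pick columns, most significant first, until all pairs separate."""
    total_pairs = comb(len(rows), 2)
    chosen, remaining = [], list(range(len(rows[0])))
    while remaining:
        # Score each remaining column in combination with those already chosen.
        # On a tie, max() keeps the earliest remaining column.
        best = max(remaining, key=lambda c: separated_pairs(rows, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
        if separated_pairs(rows, chosen) == total_pairs:
            break  # one hundred percent separation reached
    return chosen


# Hypothetical data: four elements (rows), three binary characteristics (columns).
rows = [
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
]
order = order_characteristics(rows)
print(order)
```

For this data the third column separates the most pairs on its own, so it is chosen first; the remaining two columns tie at the second stage, and the sketch keeps the earlier one.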

A larger radix allows faster identification because each characteristic excludes a greater percentage of elements. A binary radix, for instance, excludes only fifty percent of the elements per characteristic, whereas a five-valued radix excludes eighty percent of the elements per characteristic.[5] What follows is an elucidation of the matrix and separatory equations.[6]

  1. Truth Table Size-Related Equations
  2. Separatory Equations
    1. Element-Related equations
    2. Characteristic-related equations
      1. Theoretical separation
      2. Empirical separation
        1. Target Set Truth Table Values
        2. Separation Stages
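The radix percentages quoted earlier follow from simple arithmetic if one assumes that the elements are spread evenly across the possible values of a characteristic: a single observed value then keeps 1/radix of the elements and excludes the rest. The sketch below works this out; the even-spread assumption and the function names `excluded_fraction` and `min_characteristics` are introduced here for illustration only.

```python
from fractions import Fraction


def excluded_fraction(radix):
    """Fraction of elements ruled out by one characteristic, assuming the
    elements are evenly spread over the radix possible values."""
    return 1 - Fraction(1, radix)


def min_characteristics(n_elements, radix):
    """Smallest k with radix**k >= n_elements, i.e. the fewest characteristics
    that could give every element a distinct combination of values."""
    k, capacity = 0, 1
    while capacity < n_elements:
        capacity *= radix
        k += 1
    return k


print(excluded_fraction(2))           # 1/2, i.e. fifty percent
print(excluded_fraction(5))           # 4/5, i.e. eighty percent
print(min_characteristics(100, 2))    # 7, since 2**7 = 128 >= 100
print(min_characteristics(100, 5))    # 3, since 5**3 = 125 >= 100
```

Under the same assumption, one hundred elements need at least seven binary characteristics but only three five-valued ones, which is the sense in which a larger radix speeds identification.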

Examples

  1. Computational Example
  2. Application Example

Notes and References

  1. Eugene Weston Rypka passed away on April 27, 2006. Gene was born on May 6, 1925 in Owatonna, MN to Charles Frederick and Ethel Marie Rypka. He served in World War II as a paramedic in Iwo Jima and received several medals and commendations. In 1958, Gene received a Ph.D. in Medical Microbiology from Stanford University. He had a long and distinguished career, including work with Russian scientists at Lovelace Medical Center and the University of New Mexico. Bicycle racing was a lifetime love and occupation, and in later years, he also studied martial arts.
  2. Primary reference: Eugene W. Rypka (Dept. of Microbiology, Lovelace Center for Health Sciences, Albuquerque, New Mexico), "Pattern Recognition and Microbial Identification," in Biological Identification with Computers, edited by R.J. Pankhurst, British Museum (Natural History), London, England. Proceedings of a meeting held at King's College, Cambridge, 27 and 28 September 1973. Systematics Association Special Volume Number 7. Academic Press, 1975. ISBN 0125448503
  3. Characteristics and attributes are used interchangeably.
  4. See Table II, page 158 of the primary reference.
  5. The primary reference should be consulted for a more detailed and in-depth explanation of the method.