Symptom analysis of multidimensional categorical data with applications

Authors

  • N. P. Alexeyeva
  • F. S. Al-Juboori
  • E. P. Skurat

DOI:

https://doi.org/10.21533/pen.v8.i3.1191

Abstract

Linear combinations of dichotomous variables over the field GF(2), called symptoms, form a projective space from which the most informative subspaces can be selected to reduce the dimensionality of binary data. In this article, the symptom space is extended to the super-symptom space. A super-symptom is a linear combination, over a field of characteristic 2, of products of dichotomous variables without repetition. In algebra, such functions are called Zhegalkin polynomials or algebraic normal forms. It is known that every logical function has a unique representation as a Zhegalkin polynomial; therefore, by iterating over these polynomials one can find the logical function that best describes a risk group. The search algorithm for the most informative super-symptom for classification is based on the superposition of impulse sequences with two types of operations: first multiplication and then addition over GF(2). Super-symptom analysis is also a convenient method for studying the correlation between two sets of categorical variables. The method was applied to identify the most severe forms of the disease by combining hormonal, immunological and genetic tests in patients with breast cancer (data from the Cancer Oncology Hospital in Medicine City, Baghdad) and to identify genetic risk factors in patients with alcohol dependence syndrome receiving alcohol dependence therapy (St. Petersburg V.M. Bekhterev Psychoneurological Research Institute).
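The uniqueness of the Zhegalkin representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' search algorithm: it only converts a truth table into algebraic-normal-form coefficients using the standard binary Möbius transform, and then evaluates the resulting polynomial. The function names `anf_coefficients` and `evaluate_anf`, and the bitmask encoding of inputs and monomials, are assumptions made for this illustration.

```python
def anf_coefficients(truth_table):
    """Return Zhegalkin (ANF) coefficients of a Boolean function.

    truth_table[mask] is the function value at the input whose i-th
    variable equals bit i of mask. The result uses the same indexing:
    coeffs[mask] is the coefficient of the product of the variables
    whose bits are set in mask (mask 0 is the constant term).
    Hypothetical helper written for illustration only.
    """
    coeffs = [bit & 1 for bit in truth_table]
    size = len(coeffs)
    if size == 0 or size & (size - 1):
        raise ValueError("truth table length must be a power of two")
    step = 1
    while step < size:
        # Binary Moebius transform: XOR the lower half of each block
        # into the upper half (addition over GF(2)).
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                coeffs[i + step] ^= coeffs[i]
        step *= 2
    return coeffs


def evaluate_anf(coeffs, x_mask):
    """Evaluate the polynomial at the input encoded by x_mask."""
    value = 0
    for monomial_mask, coeff in enumerate(coeffs):
        # A monomial equals 1 only when all of its variables are 1
        # (multiplication); the monomials are then summed by XOR.
        if coeff and (monomial_mask & x_mask) == monomial_mask:
            value ^= 1
    return value


# x0 OR x1 has the Zhegalkin polynomial x0 + x1 + x0*x1.
or_table = [0, 1, 1, 1]
or_coeffs = anf_coefficients(or_table)
```

Here `or_coeffs` has zero constant term and unit coefficients for `x0`, `x1` and `x0*x1`. Evaluating the polynomial at every input reproduces the original truth table, which is the one-to-one correspondence the abstract relies on.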

Published

2020-08-31

Section

Articles

How to Cite

Symptom analysis of multidimensional categorical data with applications. (2020). Periodicals of Engineering and Natural Sciences, 8(3), 1517-1524. https://doi.org/10.21533/pen.v8.i3.1191