Automatic feed phase identification in multivariate bioprocess profiles by sequential binary classification

Metadata

Biochemical classification
    • Bioplastics
      1. Other
    • Bio-based fine chemicals
      1. Other
Paper

Automatic feed phase identification in multivariate bioprocess profiles by sequential binary classification

Journal

Analytica chimica acta : an international journal devoted to all branches of analytical chemistry

Authors

Nikzad-Langerodi, Ramin; Lughofer, Edwin; Saminger-Platz, Susanne; Zahel, Thomas; Sagmeister, Patrick; Herwig, Christoph

Abstract

In this paper, we propose a new strategy for retrospective identification of feed phases from feed profiles, enriched with online sensor data, of an Escherichia coli (E. coli) fed-batch fermentation process. In contrast to conventional (static), data-driven multi-class machine learning (ML), we exploit process knowledge to constrain our classification system, which yields more parsimonious models than static ML approaches. In particular, we enforce unidirectionality on a set of binary, multivariate classifiers trained to discriminate between adjacent feed phases by linking the classifiers through a one-way switch. The switch is activated when the output of the current classifier changes; the next binary classifier in the chain then takes over and discriminates between the next pair of feed phases, and so on. To prevent premature activation, we allow the switch to fire only after a predefined number of consecutive predictions of a transition event, and we carry out a sensitivity analysis on the choice of this (time) lag parameter.

From a complexity/parsimony perspective, the benefit of our approach is three-fold: i) the multi-class learning task is broken down into binary subproblems, which usually have simpler decision surfaces and tend to be less susceptible to the class-imbalance problem; ii) we exploit the fact that the process follows a rigid feed cycle structure (i.e. batch-feed-batch-feed), which allows us to focus on the subproblems involving phase transitions as they occur during the process while discarding off-transition classifiers; and iii) only one binary classifier is active at a time, which keeps the effective model complexity low.

We further use a combination of logistic regression and Lasso (i.e. regularized logistic regression, RLR) as a wrapper to extract the most relevant features for the individual subproblems from the whole set of high-dimensional sensor data. We train different soft computing classifiers, including decision trees (DT), k-nearest neighbors (k-NN), support vector machines (SVM) and a fuzzy classifier developed in-house, and compare our method with conventional multi-class ML. Our results show that the proposed method markedly outperforms static ML approaches in terms of accuracy and robustness. We achieved close to error-free feed phase classification, reducing the misclassification rates in 17 out of 20 investigated test cases by between 39% and 98.2%, depending on feature set and classifier architecture. Models trained on features selected by RLR significantly outperformed those trained on features suggested by experts, and their predictive performance was considerably less affected by the choice of the lag parameter.

Highlights
    • Dynamic classification framework for improved online feed phase identification in biochemical reactors.
    • Regularized version of logistic regression within an all-pairs scheme for variable selection.
    • Decomposition of the multi-class space into sequential binary problems, with a new enhanced fuzzy classification inference.
    • Significant reduction of classification errors (>50%) compared to state-of-the-art, plain multi-class classification.
    • Higher robustness with respect to under-represented classes (arising from short feed phases).
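
The one-way switch with a lag parameter described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `identify_phases`, the `BinaryClassifier` type and the toy threshold classifiers are hypothetical, and the sketch assumes each binary classifier returns 1 when it predicts that the process has moved on to the next feed phase. It also labels samples only from the moment the switch fires, which is a simplification.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a binary classifier maps one sample (feature vector)
# to 0 (still in the current phase) or 1 (already in the next phase).
BinaryClassifier = Callable[[Sequence[float]], int]


def identify_phases(
    samples: Sequence[Sequence[float]],
    chain: Sequence[BinaryClassifier],
    lag: int,
) -> List[int]:
    """Label each sample with a phase index in 0..len(chain).

    chain[i] discriminates phase i from phase i + 1. Only chain[active]
    is evaluated at any time, and `active` can only increase, which is
    the one-way switch.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    active = 0  # index of the classifier currently in use
    streak = 0  # consecutive transition predictions seen so far
    labels: List[int] = []
    for x in samples:
        if active < len(chain) and chain[active](x) == 1:
            streak += 1
            if streak >= lag:
                # Switch fires: move on to the next phase pair for good.
                active += 1
                streak = 0
        else:
            # Any non-transition prediction resets the counter.
            streak = 0
        labels.append(active)
    return labels


if __name__ == "__main__":
    # Toy univariate signal with one spurious spike at index 2.
    signal = [[0.1], [0.2], [0.9], [0.1], [0.8],
              [0.9], [0.9], [0.2], [0.1], [0.1]]
    chain = [
        lambda x: int(x[0] > 0.5),  # phase 0 -> 1 when the signal is high
        lambda x: int(x[0] < 0.5),  # phase 1 -> 2 when it is low again
    ]
    print(identify_phases(signal, chain, lag=2))
    print(identify_phases(signal, chain, lag=1))
```

With `lag=2` the isolated spike at index 2 is ignored and the toy run yields `[0, 0, 0, 0, 0, 1, 1, 1, 2, 2]`; with `lag=1` the same spike triggers the switch prematurely and the run yields `[0, 0, 1, 2, 2, 2, 2, 2, 2, 2]`. Because the switch is one-way, an early activation cannot be undone later.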

Publication year

2017

Publisher

Elsevier

ISSN

0003-2670

ISSN

1873-4324

Volume

982

Pages

pp.48-61

Keywords

Fermentation process; Bio-chemical reactors; Feed phase identification; Dynamic classification; Regularized logistic regression; Fuzzy classifier


Paper; 2017-08-01
