An efficient, parallelized algorithm for optimal conditional entropy-based feature selection
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor | (LCC) Lab. Ciclo Celular | pt_BR |
dc.contributor | (CeTICS) Centro de Toxinas, Resposta-imune e Sinalização Celular | pt_BR |
dc.contributor.author | Estrela, Gustavo | pt_BR |
dc.contributor.author | Gubitoso, Marco Dimas | pt_BR |
dc.contributor.author | Ferreira, Carlos Eduardo | pt_BR |
dc.contributor.author | Barrera, Junior | pt_BR |
dc.contributor.author | Reis, Marcelo da Silva | pt_BR |
dc.date.accessioned | 2020-07-09T21:28:08Z | - |
dc.date.available | 2020-07-09T21:28:08Z | - |
dc.date.issued | 2020 | pt_BR |
dc.identifier.citation | Estrela G, Gubitoso MD, Ferreira CE, Barrera J, Reis MS. An efficient, parallelized algorithm for optimal conditional entropy-based feature selection. Entropy. 2020 Apr;22(4):492. doi:10.3390/e22040492. | pt_BR |
dc.identifier.uri | https://repositorio.butantan.gov.br/handle/butantan/3069 | - |
dc.description.abstract | In machine learning, feature selection is an important step in classifier design. It consists of finding a subset of features that is optimal for a given cost function. One way to solve the feature selection problem is to organize all possible feature subsets into a Boolean lattice and to exploit the fact that the costs along chains in that lattice describe U-shaped curves. Minimizing such a cost function is known as the U-curve problem. Recently, a study proposed U-Curve Search (UCS), an optimal algorithm for that problem, which was successfully used for feature selection. However, despite the algorithm's optimality, the time UCS required in computational assays was exponential in the number of features. Here, we report that this scalability issue arises because the U-curve problem is NP-hard. We then introduce Parallel U-Curve Search (PUCS), a new algorithm for the U-curve problem. In PUCS, we present a novel way to partition the search space into smaller Boolean lattices, which renders the algorithm highly parallelizable. We also provide computational assays with both synthetic data and machine learning datasets, in which PUCS performance was assessed against UCS and other gold-standard feature selection algorithms. | pt_BR |
dc.description.sponsorship | (CNPq) Conselho Nacional de Desenvolvimento Científico e Tecnológico | pt_BR |
dc.description.sponsorship | (FAPESP) Fundação de Amparo à Pesquisa do Estado de São Paulo | pt_BR |
dc.format.extent | 492 | pt_BR |
dc.language.iso | English | pt_BR |
dc.relation.ispartof | Entropy | pt_BR |
dc.rights | Open access | pt_BR |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | pt_BR |
dc.title | An efficient, parallelized algorithm for optimal conditional entropy-based feature selection | pt_BR |
dc.type | Article | pt_BR |
dc.rights.license | CC BY | pt_BR |
dc.identifier.doi | 10.3390/e22040492 | pt_BR |
dc.identifier.url | https://doi.org/10.3390/e22040492 | pt_BR |
dc.contributor.external | (USP) Universidade de São Paulo | pt_BR |
dc.identifier.citationvolume | 22 | pt_BR |
dc.identifier.citationissue | 4 | pt_BR |
dc.subject.keyword | machine learning | pt_BR |
dc.subject.keyword | supervised learning | pt_BR |
dc.subject.keyword | information theory | pt_BR |
dc.subject.keyword | mean conditional entropy | pt_BR |
dc.subject.keyword | feature selection | pt_BR |
dc.subject.keyword | classifier design | pt_BR |
dc.subject.keyword | support-vector machine | pt_BR |
dc.subject.keyword | U-curve problem | pt_BR |
dc.subject.keyword | boolean lattice | pt_BR |
dc.relation.ispartofabbreviated | Entropy | pt_BR |
dc.identifier.citationabnt | v. 22, n. 4, 492, abr. 2020 | pt_BR |
dc.identifier.citationvancouver | 2020 Apr;22(4):492 | pt_BR |
dc.contributor.butantan | Estrela, Gustavo|:|:LCC - Laboratório de Ciclo Celular|:First author | pt_BR |
dc.contributor.butantan | Reis, Marcelo da Silva|:Researcher|:LCC - Laboratório de Ciclo Celular:Centro de Toxinas, Resposta-imune e Sinalização Celular (CeTICS)|:Corresponding author | pt_BR |
dc.sponsorship.butantan | (CNPq) Conselho Nacional de Desenvolvimento Científico e Tecnológico¦¦ | pt_BR |
dc.sponsorship.butantan | (FAPESP) Fundação de Amparo à Pesquisa do Estado de São Paulo¦¦2013/07467-1 | pt_BR |
dc.sponsorship.butantan | (FAPESP) Fundação de Amparo à Pesquisa do Estado de São Paulo¦¦2015/01587-0 | pt_BR |
dc.sponsorship.butantan | (FAPESP) Fundação de Amparo à Pesquisa do Estado de São Paulo¦¦2016/25959-7 | pt_BR |
dc.identifier.bvscc | BR78.1 | pt_BR |
dc.identifier.bvsdb | IBProd | pt_BR |
dc.description.dbindexed | Yes | pt_BR |
item.fulltext | With full text | - |
item.openairetype | Article | - |
item.languageiso639-1 | English | - |
item.grantfulltext | open | - |
crisitem.author.orcid | 0000-0002-3754-9115 | - |
Appears in Collections: | Articles |
This item is licensed under a Creative Commons License
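
The abstract describes arranging all feature subsets in a Boolean lattice and minimizing a conditional entropy-based cost whose values along a chain form a U-shaped curve. The following is a minimal sketch of that idea on toy data, not the UCS or PUCS algorithm: it simply enumerates every node of the lattice. The function names `mean_conditional_entropy` and `exhaustive_search`, the toy dataset, and the penalty rule (a pattern observed only once is charged maximal entropy) are assumptions made for illustration and are not taken from this record.

```python
from collections import Counter, defaultdict
from itertools import combinations, product
from math import log2


def mean_conditional_entropy(rows, labels, subset):
    """Estimate H(Y | X_subset) from samples.

    Assumption: a feature pattern observed only once is charged the maximal
    entropy log2(number of classes), a simple penalty for poorly estimated
    patterns.
    """
    n_classes = len(set(labels))
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        # Project each sample onto the selected features.
        groups[tuple(row[i] for i in subset)][y] += 1

    total = 0.0
    for counts in groups.values():
        m = sum(counts.values())
        if m == 1:
            h = log2(n_classes)  # penalized pattern
        else:
            h = -sum((c / m) * log2(c / m) for c in counts.values())
        total += (m / len(rows)) * h  # weight by pattern frequency
    return total


def exhaustive_search(rows, labels, n_features):
    """Visit every node of the Boolean lattice; return (best_subset, cost)."""
    subsets = (
        s
        for k in range(n_features + 1)
        for s in combinations(range(n_features), k)
    )
    return min(
        ((s, mean_conditional_entropy(rows, labels, s)) for s in subsets),
        key=lambda pair: pair[1],
    )


# Toy data: the label is X0 XOR X1, and X2 carries no information.
rows = list(product([0, 1], repeat=3))
labels = [x0 ^ x1 for x0, x1, _ in rows]

best_subset, best_cost = exhaustive_search(rows, labels, 3)

# Costs along one chain of the lattice, from the empty set to the full set.
chain = [(), (0,), (0, 1), (0, 1, 2)]
chain_costs = [mean_conditional_entropy(rows, labels, s) for s in chain]
```

On this toy data the chain costs are 1, 1, 0 and 1 bits, so the cost falls to its minimum at the subset (0, 1) and rises again once the uninformative feature is added. Exhaustive enumeration visits all 2^n subsets, which is the exponential growth the abstract attributes to the problem being NP-hard.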