sbcars - CBSoft 2013 - Universidade de Brasília
Congresso Brasileiro de Software: Teoria e Prática 29 de setembro a 04 de outubro de 2013 Brasília-DF Anais SBCARS 2013 VII Simpósio Brasileiro de Componentes, arquiteturas e reutilização de software SBCARS 2013 VII Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software 30 de setembro e 1º de outubro de 2013 Brasília-DF, Brasil ANAIS Volume 01 ISSN: 2175-7356 Coordenador do Comitê de Programa do SBCARS 2013 Uirá Kulesza, DIMAp/UFRN COORDENAÇÃO DO CBSOFT 2013 Genaína Rodrigues – UnB Rodrigo Bonifácio – UnB Edna Dias Canedo - UnB Realização Universidade de Brasília (UnB) Departamento de Ciência da Computação Promoção Sociedade Brasileira de Computação (SBC) Patrocínio CAPES, CNPq, Google, INES, Ministério da Ciência, Tecnologia e Inovação, Ministério do Planejamento, Orçamento e Gestão e RNP Apoio Instituto Federal Brasília, Instituto Federal Goiás, Loop Engenharia de Computação, Secretaria de Turismo do GDF, Secretaria de Ciência Tecnologia e Inovação do GDF e Secretaria da Mulher do GDF SBCARS 2013 7th Brazilian Symposium on Software Components, Architectures and Reuse September 30 to october 1, 2013 Brasília-DF, Brazil PROCEEDINGS Volume 01 ISSN: 2175-7356 SBCARS 2013 Program Committee Chair Uirá Kulesza, DIMAp/UFRN CBSOFT 2013 gENERAL CHAIRS Genaína Rodrigues – UnB Rodrigo Bonifácio – UnB Edna Dias Canedo - UnB ORGANIZATION Universidade de Brasília (UnB) Departamento de Ciência da Computação PROMOTION Brazilian Computing Society (SBC) SPONSORS CAPES, CNPq, Google, INES, Ministério da Ciência, Tecnologia e Inovação, Ministério do Planejamento, Orçamento e Gestão e RNP SUPPORT Instituto Federal Brasília, Instituto Federal Goiás, Loop Engenharia de Computação, Secretaria de Turismo do GDF, Secretaria de Ciência Tecnologia e Inovação do GDF e Secretaria da Mulher do GDF Autorizo a reprodução parcial ou total desta obra, para fins acadêmicos, desde que citada a fonte SBCARS 2013 Apresentação Bem-vindos ao SBCARS 2013 – Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software. Esta é a 7a edição do simpósio promovido pela Sociedade Brasileira de Computação (SBC). O objetivo do evento é congregar pesquisadores, estudantes e desenvolvedores com ampla gama de interesses nas áreas de desenvolvimento baseado em componentes, arquitetura de software e reutilização de software. SBCARS 2013 é um evento integrante da 4o Congresso Brasileiro de Software: Teoria e Prática (CBSoft 2013), o qual também agrega outros três consolidados simpósios brasileiros: XXVII Simpósio Brasileiro em Engenharia de Software (SBES 2013), XVII Simpósio Brasileiro de Linguagens de Programação (SBLP 2013) e o XVI Simpósio Brasileiro de Métodos Formais (SBMF 2013). Em 2013, o SBCARS e o CBSoft acontecem em Brasília, Distrito Federal. Seguindo a tradição de anos anteriores, o programa técnico do SBCARS é composto por: (i) uma palestra de pesquisa convidada do professor Krzysztof Czarnecki (University of Waterloo, Canada); (ii) uma palestra industrial convidada de Joseph Yoder (The Refactory, Inc, USA); e (iii) sessões técnicas com artigos científicos que abordam pesquisas relacionadas às áreas de arquitetura, componentes e reutilização de software. O programa agrega duas interessantes palestras internacionais. A palestra do professor Krzysztof Czarnecki, da Universidade de Waterloo – Canadá, aborda o tema de “Variabilidade em Software: Estado-da-Arte e Direções Futuras”. 
Professor Czarnecki é um grande especialista mundial na área de linhas de produto de software e desenvolvimento generativo. Joseph Yoder, fundador e presidente da The Refactory, Inc, apresenta a palestra “Taming Big Balls of Mud with Diligence, Agile Practices, and Hard Work”. Ele tem grande experiência prática no desenvolvimento de projetos de software de larga escala, e de oferecer consultoria e treinamento para várias empresas internacionais. Ele também é presidente do Hillside Group, o qual tem como objetivo promover a qualidade no desenvolvimento de software, e agrega a comunidade de padrões de software ao redor do mundo. O programa técnico do SBCARS 2013 é também composto de 14 artigos científicos selecionados de um conjunto de 52 artigos submetidos. A taxa de aceitação do simpósio foi de cerca de 27%. Os artigos foram selecionados através de um processo de revisão rigoroso onde cada artigo foi revisto por 4 membros do comitê de programa. Em seguida, houve um período de discussão para que os revisores pudessem confrontar suas visões, impressões e opiniões dos artigos. A partir dos resultados das revisões e discussões, um conjunto de artigos foi então selecionado. É importante enfatizar a participação de pesquisadores estrangeiros e nacionais no comitê de programa, bem como a grande dedicação dos mesmos para oferecer comentários detalhados relacionados ao artigos. Os artigos aceitos serão apresentados durante o simpósio, distribuídos em 5 sessões técnicas. Além disso, eles serão publicados na biblioteca digital da IEEE. Agradeço a todos que contribuíram para a realização do simpósio. A qualidade do programa técnico é resultado da dedicação dos membros do comitê de programa do SBCARS 2013, juntamente com os revisores adicionais. Em relação a organização do simpósio, agradeço especialmente aos membros do comitê diretivo do SBCARS pelas valiosas dicas, especialmente Marcelo Fantinato e Eduardo Almeida, que compartilharam a experiência de organização de edições anteriores do simpósio. Finalmente, agradeço a Genaína Rodrigues e Rodrigo Bonifácio – coordenadores gerais do CBSoft 2013, pelo esforço, disponibilidade e valiosas contribuições para permitir que o congresso e o simpósio fossem realizados com sucesso. Agradeço também a Vander Alves pelo apoio nas atividades durante o simpósio. Esperamos que você aproveite o programa técnico do SBCARS 2013. Brasília, Setembro 2013. Uirá Kulesza – DIMAp/UFRN Coordenador do Comitê de Programa 5 SBCARS 2013 Foreword Welcome to SBCARs 2013 – Brazilian Symposium on Software Components, Architectures and Reuse. This is the 7th edition of the symposium promoted by the Brazilian Computing Society (SBC). The aim of the event is to gather researchers, students and practitioners with a wide range of interests in component based development, architectures and software reuse. SBCARS 2013 is an event part of the 4th Brazilian Conference on Software: Theory and Practice (CBSoft 2013), which will also host three other well-established Brazilian symposia: 27th Brazilian Symposium on Software Engineering (SBES 2013), 17th Brazilian Symposium on Programming Languages (SBLP 2013), 16th Brazilian Symposium on Formal Methods (SBMF 2013). This year, SBCARS and CBSoft 2013 will happen in Brasília, Distrito Federal. 
Following the tradition of previous years, the technical program of SBCARS 2013 have: (i) one keynote research talk by professor Krzysztof Czarnecki (University of Waterloo, Canada); (ii) one keynote industrial talk by Joseph Yoder (The Refactory, Inc, USA); and (iii) technical sessions with 14 papers exploring research work on several topics related to software architecture, components and reuse. We are happy to have two interesting international invited talks in SBCARS 2013. Professor Krzysztof Czarnecki (University of Waterloo, Canada) will present an invited research talk entitled “Variability in Software: State of the Art and Future Directions”. He is a well-known expert on software product lines and generative development. Joseph Yoder, founder and principal of The Refactory, Inc, will speak about “Taming Big Balls of Mud with Diligence, Agile Practices, and Hard Work”. He has great practical on the development of large-scale software projects and of providing consultancy and training for many international companies. He is also the President of The Hillside Group a group dedicated to improving the quality of software development, which aggregates the pattern community around the world. The technical program of SBCARS 2013 is also composed of 14 research papers selected from 52 submissions. The acceptance rate was about 27%. They were selected after a rigorous review process where each paper was reviewed by four members of the program committee. After that, there was a discussion period for reviewers to confront their views, impressions and opinion for the papers. Based on those reviews and discussions, a list of technical papers was selected. It is important to emphasize the participation of international and national members in the program committee, and their great dedication to provide detailed feedback for the authors of submitted papers. The accepted papers will be presented in five technical sessions. The SBCARS proceedings will be published in IEEE Digital Library. I would like to thank all who contributed to this event. The quality of the technical program is a result of the dedication of the members of the program committee of SBCARS 2013, together with additional reviewers. For the organization of the symposium, I would like to express my gratitude to the steering committee of SBCARS for their valuable support, specially Marcelo Fantinato and Eduardo Almeida, for kindly sharing their experience related to the organization of previous editions of the symposium. I also thank the great and hard work of the Genaína Rodrigues and Rodrigo Bonifácio – the CBSoft 2013 General Chairs, for their effort, availability and contributions in making this event possible. I also thank Vander Alves for his support during the symposium. We hope you enjoy the technical program of the SBCARS 2013. Brasília, September 2013. Uirá Kulesza – DIMAp/UFRN Program Chair 6 SBCARS 2013 Program Chair - Biography Uirá Kulesza é professor adjunto do Departamento de Informática e Matemática Aplicada (DIMAp), Universidade Federal do Rio Grande do Norte (UFRN), Brasil. Ele obteve seu doutorado em Informática pela PUC-Rio (2007), Brasil, tendo realizado doutorado sanduíche na University of Waterloo (Canadá) e Lancaster University (Inglaterra). Seus principais interesses de pesquisa são: engenharia de linhas de produtos de software, desenvolvimento generativo e arquitetura de software. Ele é co-autor de cerca de 120 artigos científicos em periódicos, conferências e livros. 
Ele atuou como pesquisador senior de pós-doutorado no projeto AMPLE (2007-2009) - Aspect-Oriented Model-Driven Product Line Engineering (www.ample-project.net) pela Universidade Nova de Lisboa, Portugal. Ele é atualmente pesquisador bolsista de produtividade nível 2 do CNPq. Uirá Kulesza is an Associate Professor at the Department of Informatics and Applied Mathematics (DIMAp), Federal University of Rio Grande do Norte (UFRN), Brazil. He obtained his DSc in Computer Science at PUC-Rio – Brazil (2007), in cooperation with University of Waterloo (Canada) and Lancaster University (UK). His main research interests include: software product lines, generative development and software architecture. He has co-authored over 120 referred papers in journals, conferences, and books. He worked as a post-doc researcher member of the AMPLE project (2007-2009) – AspectOriented Model-Driven Product Line Engineering (www.ample-project.net) at New University of Lisbon, Portugal. He is currently a CNPq (Brazilian Research Council) research fellow level 2. 7 SBCARS 2013 Comitês Técnicos / Program Committee COORDENADOR DO COMITÊ DE PROGRAMA Uirá Kulesza, DIMAp/UFRN Comitê Diretivo/Steering Committee Ana Paula Terra Bacelo, PUCRS Eduardo Santana de Almeida, UFBA e RiSE Marcelo Fantinato, EACH-USP Paulo Pires, UFRJ Uirá Kulesza, DIMAp/UFRN Comitê de Programa / Program Committee Alessandro Garcia, PUC-Rio, Brazil Alexandre Alvaro, UFSCar-Sorocaba, Brazil Alexandre Correa, Unirio, Brazil Aline Vasconcelos, IFF, Brazil Ana Paula Bacelo, PUCRS, Brazil Andres Diaz-Pace, UNICEN, Argentina Antonio Francisco Prado, UFSCar, Brazil Arndt von Staa, PUC-Rio, Brazil Camila Nunes, GE Global Research Center, Brazil Cecilia Rubira, Unicamp, Brazil Christina Chavez, UFBA, Brazil Claudia Werner, COPPE/UFRJ, Brazil Cláudio Sant’Anna, UFBA, Brazil Dalton Serey, UFCG, Brazil Daniel Lucrédio, UFSCar, Brazil David Weiss, Iowa State University, USA Edson Oliveira Junior, UEM, Brazil Eduardo Almeida, UFBA, Brazil Eduardo Guerra, INPE, Brazil Elder Cirilo, PUC-Rio, Brazil Elisa Yumi Nakagawa, ICMC-USP, Brazil Ellen Francine Barbosa, ICMC-USP, Brazil Fabiano Ferrari, UFSCar, Brazil Fernando Castor, UFPE, Brazil Flavia Delicato, UFRJ, Brazil Flavio Oquendo, European University of Brittany/IRISA-UBS, France Franklin Ramalho, UFCG, Brazil Genaina Rodrigues, UnB, Brazil Gibeon Aquino, UFRN, Brazil Gledson Elias, UFPB, Brazil Guilherme Travassos, COPPE/UFRJ, Brazil Ingrid Nunes, UFRGS, Brazil Itana Maria de Souza Gimenes, UEM, Brazil 8 SBCARS 2013 Jaejoon Lee, Lancaster University, UK Jeffrey Poulin, Lockheed Martin, USA Jobson Massollar, COPPE/UFRJ, Brazil John McGregor, Clemson University, USA José Maldonado, ICMC-USP, Brazil Kiev Gama, UFPE, Brazil Leonardo Murta, UFF, Brazil Lidia Fuentes, University of Málaga, Spain Lucineia Thom, UFRGS, Brazil Marcelo Eler, EACH-USP, Brazil Marcelo Fantinato, EACH-USP, Brazil Marcilio Mendonça, Black Berry, Canada Marcio Ribeiro, UFAL, Brazil Marco Aurelio Gerosa, IME-USP, Brazil Marco Tulio Valente, UFMG, Brazil Maurizio Morisio, Polytechnic University of Turin, Italy Nelio Cacho, UFRN, Brazil Nelson Rosa, UFPE, Brazil Oliver Hummel, University of Mannheim, Germany Padraig O’Leary, Dublin City University, Ireland Paris Avgeriou, University of Groningen, The Netherlands Patricia Machado, UFCG, Brazil Paulo Merson , Carnegie Mellon University – USA, TCU – Brazil Paulo Pires, UFRJ, Brazil Regina Braga, UFJF, Brazil Rick Rabiser, Johannes Kepler University of Linz, Austria Roberta Coelho, UFRN, Brazil 
Roberto Bittencourt, UEFS, Brazil Rodrigo Bonifacio, UnB, Brazil Rohit Gheyi, UFCG, Brazil Rosana Braga, ICMC-USP, Brazil Silvia Abrahão, Universitat Politècnica de València, Spain Thais Vasconcelos Batista, UFRN, Brazil Toacy Oliveira, COPPE/UFRJ, Brazil Uirá Kulesza, UFRN, Brazil Vander Alves, UnB, Brazil Vinicius Garcia, UFPE, Brazil REVISORES / referees Anderson Silva, UFPE, Brazil Andrea Magdaleno, COPPE/UFRJ, Brazil Francisco Monaco, ICMC-USP, Brazil Frank Affonso, UNESP, Brazil Frederico Lopes, UFRN, Brazil Gustavo de Souza Santos, UFMG, Brazil Hudson Borges, UFMG, Brazil Ivan do Carmo Machado, UFBA, Brazil Jair Leite, UFRN, Brazil Jose Miguel Horcas, University of Malaga, Spain 9 SBCARS 2013 Juliana Saraiva, UFPE, Brazil Katia Felizardo, USP, Brazil Lucas Bueno Ruas Oliveira, ICMC-USP, Brazil Marcelo Gonçalves, USP, Brazil Nadia Gomez, University of Malaga, Spain Nemésio Freitas Duarte Filho, ICMC-USP, Brazil Paulo Barbosa, UFCG, Brazil Paulo Mota Silveira, UFPE, Brazil Raphael Pereira de Oliveira, UFBA, Brazil Sergio Carvalho, UFG, Brazil Thiago Henrique Burgos de Oliveira, Fraunhofer IESE, Germany Vanius Zapalowski, UFRGS, Brazil Vitor Madureira Sales, UFMG, Brazil 10 SBCARS 2013 Palestras convidadas / invited keynotes Variabilidade em Software: Estado da Arte e Direções Futuras / Variability in Software: State of the Art and Future Directions Krzysztof Czarnecki, University of Waterloo, Canada Resumo. Variabilidade é um aspecto fundamental do software. É a habilidade de criar variantes do sistema para diferentes segmentos de mercado e contextos de uso. Variabilidade vem sendo investigada e explorada no contexto de linhas de produto de software, mas é relevante para outras áreas, tais como, ecosistemas de software e software sensível ao contexto. Variabilidade introduz complexidade para todas as áreas da engenharia de software, e demanda à criação de métodos e ferramentas baseados em variabilidade que possam lidar de forma efetiva com tal complexidade. A engenharia de um sistema com variabilidades requer a engenharia de um conjunto de sistemas simultaneamente. Como resultado, requisitos, arquitetura, código e testes são inerentemente mais complexas do que na engenharia de software tradicional. Métodos e ferramentas baseados em variabilidade promovem as similaridades entre variantes do sistema, enquanto também permitem a gerência efetiva das suas diferenças. Esta palestra irá analisar como a variabilidade afeta o ciclo de vida do software, focando em requisitos, arquitetura e verificação/validação, rever o estado-da-arte de métodos e ferramentas baseados em variabilidade, assim como identificar direções futuras. Abstract. Variability is a fundamental aspect of software. It is the ability to create system variants for different market segments or contexts of use. Variability has been most extensively studied in software product lines, but is also relevant in other areas, including software ecosystems and context-aware software. Variability introduces essential complexity into all areas of software engineering, and calls for variability-aware methods and tools that can deal with this complexity effectively. Engineering a variable system amounts to engineering a set of systems simultaneously. As a result, requirements, architecture, code and tests are inherently more complex than in single-system engineering. Variabilityaware methods and tools leverage the commonalities among the system variants, while managing the differences effectively. 
This talk will analyze how variability affects the software lifecycle, focusing on requirements, architecture and verification and validation, review the state of the art of variability-aware methods and tools, and identify future directions. TAMING BIG BALLS OF MUD WITH DILIGENCE, AGILE PRACTICES, AND HARD WORK Joseph Yoder, The Refactory, Inc, US Abstract. Big Ball of Mud (BBoM) architectures are viewed as the culmination of many design decisions that, over time, result in a system that is hodgepodge of steaming, smelly anti-patterns. Yet how BBoM architectures come into existence and successfully evolve is much more nuanced. BBoMs often do not result from well-intentioned design ideas gone wrong. Nor are they simply an accretion of expedient implementation hacks. Often BBoM systems can be extremely complex, with unclear and unstable architectural boundaries and requirements. Because of their complexity, BBoM architectures are likely 11 SBCARS 2013 not understood by any single mind. They typically are fashioned out of many parts, which together comprise a sprawling whole. So BBoM systems can and do have good, as well as bad and ugly parts. Successfully deployed BBoM systems continue to work well enough, in spite of their design flaws. How can that be? Much has happened in our industry since the original discussion and publication of the Big Ball of Mud paper in 1998. When that paper was originally published, agile methodologies such as eXtreme Programming and Scrum were just beginning to gain popularity. The growth and popularity of agile practices might be partially due to the fact that the state of the art of software architectures is so muddy. Being agile, with its focus on extensive testing and frequent integration, arguably makes it easier to deal with evolving architectures and keeping systems working while making significant improvements and adding functionality. However, although Agile has been around for quite some time, we can see that it still does not prevent Mud. This talk will examine the paradoxes that underlie Big Balls of Mud, what causes them, and why they are so prominent. I’ll explore what agile practices can help us avoid or cope with mud. I’ll also explain why continuous delivery and TDD with refactoring is not enough to help ensure clean architecture and why it is important to understand what is core to the architecture and the problem at hand. Understanding what changes in the system and at what rates can help you prevent becoming mired in mud. By first understanding where a system’s complexities are and where it keeps getting worse, we can then work hard (and more intelligently) at sustaining the architecture. Additionally, I’ll talk about some practices and patterns that help keep the code clean or from getting muddier The original Big Ball of Mud paper described some best practices such as SHEARING LAYERS and SWEEPING IT UNDER THE RUG as a way to help deal with muddy architectures. Additionally there are some other practices such as PAVING OVER THE WAGON TRAIL and WIPING YOUR FEET AT THE DOOR that can make code more habitable. 12 SBCARS 2013 PALESTRANTES / keynotes Krzysztof Czarnecki (University of Waterloo, Canada) Biografia. Krzysztof Czarnecki é professor de Engenharia da Computação e Elétrica da Universidade de Waterloo, Canadá. Ele foi pesquisador na Daimler Chrysler Research (1995-2002), Alemanha, atuando na melhoria de práticas e tecnologias de desenvolvimento de software para os domínios corporativo, automotivo, espacial e aeroespacial. 
Ele é co-autor do livro “Generative Programming” (AddisonWesley, 2000), o qual aborda a automação do desenvolvimento componentizado de software baseado em linguagens específicas de domínio. Como pesquisador na Universidade de Waterloo, ele foi Research Chair em Engenharia de Requisitos de Sistemas de Software Orientado a Serviços do NSERC/ Bank of Nova Scotia, e trabalhou em vários tópicos em engenharia de software dirigida por modelos, incluindo linhas de produto de software e modelagem de variabilidade, gerenciamento de consistência e transformações bidirecionais, e modelagem dirigida por exemplos. Ele recebeu os seguintes prêmios: Premier’s Research Excellence Award em 2004 e o British Computing Society in Upper Canada Award for Outstanding Contributions to IT Industry em 2008. Biografy. Krzysztof Czarnecki is a Professor of Electrical and Computer Engineering at the University of Waterloo. Before coming to Waterloo, he was a researcher at Daimler Chrysler Research (19952002), Germany, focusing on improving software development practices and technologies in enterprise, automotive, space and aerospace domains. He co-authored the book on “Generative Programming” (Addison- Wesley, 2000), which deals with automating software component assembly based on domainspecific languages. While at Waterloo, he held the NSERC/Bank of Nova Scotia Industrial Research Chair in Requirements Engineering of Service-oriented Software Systems (2008-2013) and has worked on a range of topics in model-driven software engineering, including software-product lines and variability modeling, consistency management and bi-directional transformations, and example-driven modeling. He received the Premier’s Research Excellence Award in 2004 and the British Computing Society in Upper Canada Award for Outstanding Contributions to IT Industry in 2008. JOSEPH YODER (The Refactory, Inc, US) Biografy. Joseph Yoder is a founder and principal of The Refactory, Inc., a company focused on software architecture, design, implementation, consulting and mentoring on all facets of software development. Joseph is an international speaker and pattern author, long standing member of the ACM, and the President of The Hillside Group, a, a group dedicated to improving the quality of software development. Joseph specializes in Architecture, Analysis and Design, C#, Java, Smalltalk, Patterns, Agile Methods, Adaptable Systems, Refactoring, Reuse, and Frameworks. Joe is the author of many patterns, including being an author of the Big Ball of Mud pattern, which illuminates many fallacies in the approach to software architecture.Joe currently resides in Urbana, Illinois. He teaches Agile Methods, Design Patterns, Object Design, Refactoring, and Testing in industrial settings and mentors many developers on these concepts. He currently oversees a team of developers who have constructed many systems based on enterprise architecture using the .NET environment. Other projects involve working in both the Java and .NET environments deploying Domain-Specific Languages for clients. 13 SBCARS 2013 Joe presented tutorials and talks, arranges workshops, and organizes leading technical conferences held throughout the world, including international conferences such as Agile, Agile Portugal, Encontro ¡gil, AOSD, CBSoft, JAOO, QCon, PLoP, AsianPLoP, SugarLoafPLoP, OOPSLA, ECOOP, and SPLASH. Joe thinks software is still too hard to change. 
He wants to do something about this and believes that good patterns, combined with putting the ability to change software into the hands of the people who have the knowledge to change it, are a promising avenue for solving this problem. SBCARS 2013 Índice / Table of Contents
A Feature-Driven Requirements Engineering Approach for Software Product Lines, p. 17. Raphael Pereira de Oliveira (UFBA), Emilio Insfran (UPV), Silvia Abrahão (UPV), Javier Gonzalez-Huerta (UPV), David Blanes (UPV), Sholom Cohen (SEI), Eduardo Santana de Almeida (UFBA)
Improving modular reasoning on preprocessor-based systems, p. 27. Jean Melo (UFPE), Paulo Borba (UFPE)
Software Variability Management: An Exploratory Study with Two Feature Modeling Tools, p. 36. Juliana Pereira (UFMG), Carlos Souza (UFMG), Ramon Abílio (UFLa), Gustavo Vale (UFLa), Eduardo Figueiredo (UFMG), Heitor Costa (UFLa)
Evaluating the Propagation of Exceptions in the Service Oriented Architecture in .NET, p. 46. José Alex Lima (UFRN), Eliezio Soares (UFRN), José Sueney (UFRN), Nélio Cacho (UFRN), Roberta Coelho (UFRN), Umberto Costa (UFRN)
Mapeamento Sistemático de Computação Orientada a Serviços no Contexto de QoS, p. 55. Danilo Mendonça (UnB), Genaina Rodrigues (UnB), Maristela Holanda (UnB), Aletéia P. F. Araújo (UnB)
aCCountS: Uma arquitetura orientada a serviços para flexibilizar a tarifação em nuvens de infraestrutura, p. 65. Nayane Ponte (UFC), Fernando Trinta (UFC), Ricardo Viana (UFC), Rossana Andrade (UFC), Vinicius Garcia (UFPE), Rodrigo Assad (USTO.RE)
AspectJ-based Idioms for Flexible Feature Binding, p. 75. Rodrigo Andrade (UFPE), Henrique Rebêlo (UFPE), Marcio Ribeiro (UFAL), Paulo Borba (UFPE)
Towards Validating Complexity-based Metrics for Software Product Line Architectures, p. 85. Anderson Marcolino (UEM), Edson Oliveira Junior (UEM), Itana Maria de Souza Gimenes (UEM), Tayana Conte (UFAM)
A SPL infrastructure for supporting scientific experiments in petroleum reservoir research field, p. 95. Fernanda Foschiani (Unicamp), Leonardo Tizzei (IBM Research), Cecília Rubira (Unicamp)
BISTFaSC: An Approach To Embed Structural Testing Facilities Into Software Components, p. 105. Marcelo Eler (USP), Paulo Masiero (USP-SC)
Using Thesaurus-Based Tag Clouds to Improve Test-Driven Code Search, p. 115. Otavio Lemos (UNIFESP), Gustavo Konishi (Touch), Adriano Carvalho (UNIFESP), Joel Ossher (Palantir Technologies), Cristina Lopes (University of California, Irvine), Sushil Bajracharya (Black Duck Software)
MTP: Model Transformation Profile, p. 125. Ana Patricia Fontes Magalhães (UFBA), Aline Andrade (UFBA), Rita Suzana Pitangueira Maciel (UFBA)
A Model-Driven Infrastructure for Developing Product Line Architectures Using CVL, p. 135. Amanda S.
Nascimento (Unicamp), Cecília Rubira (Unicamp), Rachel Burrows (University of Bath), Fernando Castor (UFPE) A Reference Architecture based on Reflection for Self-adaptive Software Frank José Affonso (UNESP), Elisa Yumi Nakagawa (ICMC-USP) 16 145 A Feature-Driven Requirements Engineering Approach for Software Product Lines Raphael Pereira de Oliveira1, Emilio Insfran2, Silvia Abrahão2, Javier Gonzalez-Huerta2, David Blanes2, Sholom Cohen3, Eduardo Santana de Almeida1,4 1 Department of Computer Science, Federal University of Bahia (UFBA), Salvador, Brazil Department of Information Systems and Computation, Universitat Politècnica de València (UPV), Valencia, Spain 3 Software Engineering Institute (SEI), Carnegie Mellon University, Pittsburgh, USA 4 Fraunhofer Project Center (FPC) for Software and Systems Engineering, Brazil Email: {raphaeloliveira, esa}@dcc.ufba.br, {einsfran, sabrahao, jagonzalez, dblanes}@dsic.upv.es, [email protected] 2 deploy features must also be specified in a systematic way by establishing explicit traceability between features and requirements. In the Application Engineering process, the RE activities are intended to specify the requirements for a particular product in the product family. Therefore, it is important to determine which requirements from the product line are relevant to the product to be developed (common and variant feature selection), and also to refine or to add new specific requirements, which are not in the product line (delta requirements). Abstract—The importance of Requirements Engineering within software development has long been established and recognized by researchers and practitioners. Within Software Product Lines (SPL), this activity is even more critical because it needs to cope with common, variable, and product-specific requirements not only for a single product but for the whole set of products in the family. In this paper, we present a FeatureDriven Requirements Engineering approach (FeDRE) that provides support to the requirements specification of software product lines. The approach follows a strategy where features are realized into functional requirements considering the variability captured in a feature model. It also provides guidelines on how to associate chunks of features from a feature model and to consider them as the context for the Use Case specification. The feasibility of the approach is illustrated in a case study for developing an SPL of mobile applications for emergency notifications. Preliminary evaluations on the perceived ease of use and perceived usefulness of requirements analysts using the approach are also presented. Most of the approaches dealing with requirements engineering in SPL development tend to include variability information in traditional requirements models (e.g., use case diagrams) [20] or to extract feature models [16] from requirements specifications following a bottom-up strategy [6] [21]. Some limitations of these approaches arise from the possibility of a large number of requirements and features making the specification of requirements hard to understand, maintain and prone to inconsistencies. The contribution of our approach is that we circumscribe the requirements specifications to deal with complexity in a more effective way. Effectiveness is achieved by chunking the requirement activity based on areas of the feature model. This constrains the extent of the requirements specification at any one time to a more specific area of the product line. 
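As a rough sketch of this chunking idea (illustrative only, not an artifact of FeDRE itself; the class names are assumptions, and the example features are borrowed from the SAVi case study presented later), a feature-model area can be represented as a subtree and that subtree treated as the unit for one requirements-specification increment:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Feature:
    name: str
    variability: str                      # "mandatory" | "optional" | "or" | "xor"
    children: List["Feature"] = field(default_factory=list)

def chunk(root: Feature) -> List[Feature]:
    """A chunk is a feature plus its whole subtree: the scope of one specification increment."""
    features = [root]
    for child in root.children:
        features.extend(chunk(child))
    return features

# Hypothetical chunk built around the Import_Contact area of the feature model.
import_contact = Feature("Import_Contact", "or", [
    Feature("Facebook_Import", "optional"),
    Feature("Twitter_Import", "optional"),
    Feature("Phone_Import", "mandatory"),
])
for feature in chunk(import_contact):
    print("specify requirements for:", feature.name)
```

Under this reading, each increment confronts the requirements analyst only with the features of one subtree, which is what keeps the context of the use case specification narrow.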
We use the feature model as a basis mainly due to the fact that in the SPL community, features are first-class citizens, which are easily identifiable, well-understood, and easy to communicate by SPL developers and domain experts. Hence, there is a strong need to define traceability links between these features and the requirements, and whenever possible, to keep the model and specification synchronized and consistent [4] [5] [15]. Keywords—Product Lines; Requirements Specification; Reuse; I. INTRODUCTION Defining requirements to determine what is to be developed is generally accepted as a vital but difficult part of software development. Establishing the driving architectural requirements not only simplifies the design and implementation phases but also reduces the number of errors detected in later stages of the development process, reducing risk, duration and budget of the project [19]. In particular, the specification of requirements in Software Product Lines (SPL) [10] development is even more critical. In this context, it needs to cope with common, variable, and product-specific requirements not only for a single product but for the whole set of products in the family. A fundamental aspect of engineering SPLs is to apply Requirements Engineering (RE) practices to deal with scoping and specifying the product line in both the Domain Engineering and Application Engineering processes. In this paper we introduce a Feature-Driven Requirements Engineering (FeDRE) approach to help developers in the requirements engineering activity for SPL development. The approach proposes a set of artifacts, activities, roles, and guidelines on the basis of the features to be developed. Due to space constraints, we focus on the requirements specification of the Domain Engineering activity. We further focus our description of FeDRE by starting once a feature model has been defined (in the scoping activity). However, we do not deal with Quality Attributes (QAs) in the feature model. The next FeDRE In the Domain Engineering process, the RE activities are intended to define the extent of the product line to determine its products (scoping), and also to identify common, variable, and product-specific features throughout the product line. The specification of requirements needed to 17 activity consists of the systematic realization of features in terms of use cases. This activity specifies requirements but also establishes traceability between the features and the requirements. This, allows us to provide variability mechanisms at the requirements level (by using use cases and alternative scenarios) according to the chunk of the feature model that these requirements specify. The main contributions of FeDRE is a RE approach that 1) systematically realizes features into requirements considering the variability captured in the feature model and 2) breaks the top-down driven paradigm through use of the feature model to prioritize features according to architecturally significant areas of the product line. AMPLE project [4]. This approach presents two main contributions. First, the VML4RE language is a Domain Specific Language able to compose requirements with feature models; and second, a requirements process that uses: i) a feature model to perform the variability identification; ii) use cases and activity diagrams to describe the SPL domain requirements; and a iii) a VML4RE model to relate the feature model with the requirement models. 
These three models are taken as input with a feature configuration by an interpreter to obtain the application requirements. They also provide consistency checking between features and use case scenarios. An important difference with our approach is that FeDRE guides the requirements analyst in specifying requirements from the feature model, instead of having to focus on constructing the feature model to compose a requirements specification. The remainder of the paper is structured as follows. Section II discusses related work on SPL-specific requirements engineering approaches. Section III presents our feature-driven requirements engineering approach. Section IV illustrates the feasibility of the approach through a case study conducted to develop an SPL of mobile applications for emergency notifications. Finally, Section V presents our conclusions and future work. Other related work includes approaches that extend different requirements models such as use cases and scenarios with variability concepts without explicitly using feature modeling [8] [20]. The Pulse-CDA approach [8] is a subset of the Pulse methodology [7] that takes the information from the economic scope (a range of system characteristics and a scope definition) and then outputs a domain model (composed of a set of work products that capture different domain views) and a decision model. In Muthig et al. [22], the use-case technique is used as a work product to represent the variability in the use cases. Any element from the use case diagram or in the textual scenario can be a variant (e.g., an actor or a use case). The variant elements are enclosed with XML-style tags to explicitly mark variability. DREAM [20] is a different technique that extends traditional use cases to support variability, where the starting point is a set of legacy systems analyzed to extract the requirements. DREAM uses two stereotypes that are defined to represent variability in use case diagrams: «common» when the use case is always present for every product configuration and «optional» when the use case is present in some product configurations. In Pulse-CDA the decision model is traced to the variable elements in the use case and scenario description in order to instantiate the models. In FeDRE, this variability from use cases and scenarios is traced to the feature model through a traceability matrix. Neither PuLSE-CDA nor DREAM propose roles in their RE process to extend the requirements models to support variability. However, both approaches define the input and output artifacts in their processes. PulSE-CDA does not provide guidelines, but DREAM proposed a set of guidelines to obtain the domain requirements specification from legacy systems. II. RELATED WORK Over the last few years, several models and techniques that deal with the specification of requirements in SPL development have been proposed. Traditional requirements models, such as use case models [13] [14], scenario-based models [2] [4] [8] [20], UML models [3] [15] [25], or goal models [26], have been adapted and effectively used as an abstraction mechanism for modeling requirements in SPL development. One of the main differences between traditional RE and RE for product lines is the inclusion of commonality and variability in requirements identification and modeling. Some approaches combine feature models with more traditional requirements engineering techniques such as use cases [14] [13]. 
FeatuRSEB [14] proposes simultaneously building a use case model and a feature model, and afterwards performing the commonality and variability analysis first over the use case models and then over the feature model. PLUSS [13] improves the FeatuRSEB approach by adding more variability mechanisms: i) at the use case level; ii) at the alternative scenario level; iii) at the flow of events from an included alternative scenario; and, iv) with cross-cutting aspects that affect several use cases. Neither FeatuRSEB nor PLUSS propose roles or guidelines to help in the RE activity. In addition, input and output artifacts are only partially defined. In FeDRE, the feature model is the main artifact to model variability, and a use case model is built for chunks of this feature model in a systematic way. This improves our ability to deal with complexity by narrowing the context of the use case specification. Regarding the variability mechanisms in requirements, FeDRE borrows the first two types of variability from PLUSS (use case level, and alternative scenario). Another traditional RE technique is goal modeling. In Soltani et al. [26], the stakeholder’s intentions represented as goals are mapped to the software features to express their variability in an annotated feature model. In Aspect-oriented User Requirements Notation (AoURN) [21] the authors propose four main domain engineering activities: i) build a stakeholder goal model; ii) build a feature model, where features are represented as goal-tasks with the «feature» stereotype; iii) build the feature impact model to establish the impact of features on the stakeholder’s goals; and iv) The idea of combining a feature model with use cases was also used by the Variability Modeling Language for Requirements (VML4RE) approach in the context of the 18 create the feature scenario model, where non-leaf features are described in more detail with the Aspect-oriented Use Case Maps (AoUCM). In AoURN the traceability from features to requirements is done by using links from the stereotyped tasks in the feature model to the AoURN scenario model. These approaches do not define the roles in their processes and only provide partial guidelines for their use. Additionally, the input and output artifacts are only partially defined. Both proposals allow the RE expert to obtain a feature model from a previous goal model. In FeDRE, the starting point is a feature based on concepts that the domain expert directly works with, instead of unfamiliar goal models to guide the creation of the feature model. III. A FEATURE-DRIVEN REQUIREMENTS ENGINEERING APPROACH FOR SPL The Feature-Driven Requirements Engineering (FeDRE) approach for SPLs has been defined considering the feature model as the main artifact for specifying SPL requirements. The aim of the approach is to perform the requirements specification based on the features identified in the SPL domain in a systematic way using guidelines that establish traceability links between features and requirements. By domain, we mean the context where the family of products or functional areas across the products exhibits common, variable or specific functionalities. The main activities of the FeDRE approach are: Scoping, Requirements Specification for Domain Engineering, and Requirements Specification for Application Engineering. Figure 1 shows Domain Enginnering activities where FeDRE is applied. The following roles are involved in these activities: Domain Analyst, Domain Expert, Market Expert and the Domain Requirements Analyst. 
The Domain Analyst, Domain Expert and Market Expert perform the scoping activity and the Domain Requirements Analyst performs the requirements specification for domain engineering activity. Another alternative to specify requirements in SPL is to extend UML traditional notations with variability information. Shaker et al. [25] propose the Feature-Oriented Requirements Modeling Language (FORML) based on feature modeling and UML state-machines. FORML decomposes the requirements into the world model and the behavior model. In the world model, a feature model describes the features that compose the problem. One feature in the world model is decomposed into several feature modules in the behavior model. A feature module is represented with an UML-like finite state machine. This decomposition permits feature modularity, which is one of the main contributions of the work. FORML does not define roles and guidelines in the process in order to obtain the requirements specification. Comparing FORML and FeDRE, both approaches give support to modularity. FORML decomposes a feature model from the world model into several feature modules in the behavior model; similarly, FeDRE allows decomposing sets of features into functional requirements using use cases, scenarios, and traceability links. In summary, we analyzed several RE approaches for SPL development, and found a disparate set of approaches and techniques. In many cases the scoping and requirements specification activities are considered as independent activities. According to John et al. [18], well-defined relationships and interfaces between scoping and requirements artifacts should be defined in order to reduce rework. To alleviate this problem, FeDRE considers the scoping artifacts as the starting point and defines guidelines to conduct the SPL requirements specification driven by the scoping artifacts. Another important factor is the strategy followed to specify the requirements. Several approaches extend RE models such as use cases [13] [14] or goal models [6] [21] adapted to the SPL domain, and extract feature models from these RE models. In our view, SPL developers and domain experts are more familiar with the concept of feature and variability modeling; therefore, as a way to deal with complexity, we restrict the requirements specification in accordance with chunks of the feature model. Moreover, guidelines to specify functional requirements related to features are provided, thus resulting in an explicit traceability framework built in a systematic way. Fig 1. Overview of the FeDRE approach A. Scoping Scoping determines not only what products to include in a product line but also whether or not an organization should launch the product line. According to Bosch [9], the scoping activity consists of three levels: product portfolio scoping, domain scoping, and asset scoping. Product portfolio scoping determines which products and product features should be included in a product line. Domain scoping defines the functional areas and subareas of the product line domain, while Asset scoping identifies assets with costs and benefits estimated for them. In FeDRE, scoping results in three main artifacts: the Feature Model, 19 the Feature Specification, and the Product Map, using the Existing Assets (if any) as input artifact. These three artifacts will drive the SPL requirements specification for domain engineering. Each of these artifacts is detailed below. 
The FeDRE approach was defined using and extending the PLUSS approach [13], which represents requirements specifications as use case scenarios. The use case scenarios “force requirements analysts to always think about interfaces since separate fields exists for describing actor and system actions”. Our approach supports the relationship between features and use cases; thus, the feature variability is also expressed within the use cases. FeDRE differs from PLUSS in our approach to two types of variability: i) use case variability, considering the whole use case as a variant; and ii) scenario variability, where the variants are alternative scenarios of a use case. In our approach, these two types of variability are sufficient to capture the variations within SPL requirements, and do not require the steps variability nor cross-cutting parameters, as presented in [13]. 1) Existing Assets When performing an extractive or reactive SPL adoption [17], existing assets (e.g., user manual or existing systems) help the Domain Analyst and the Domain Expert identify the features and products in the SPL. Otherwise, a proactive approach can be followed to build the SPL from scratch. 2) Feature Model Feature modeling is a technique used to model common and variable properties, and can be used to capture, organize and visualize features in the product line. The Domain Analyst and the Domain Expert identify features using existing assets as input or by eliciting information from experts. A Feature Model diagram [16] will identify features, SPL variations, and constraints among the features in the product line. When a requirement is identified or refined, it should be determined if it is a shared requirement for different products in the product line, or if it is a specific requirement of a single product. Shared requirements must also be classified into common and variable requirements. Common requirements are used throughout the product line and variable requirements must be configured or parameterized in the specification of different variants of the product line. In addition, some requirements may require or exclude other requirements, or may restrict possible configurations of other requirements. Feature models may help in handling the different types of dependencies among requirements, which can be complex and must be properly addressed. 3) Feature Specification The Domain Analyst is responsible for specifying the features using a feature specification template. This template captures the detailed information of the features and keeps the traceability with all the involved artifacts. According to the template, each feature must have a unique identifier Feat_id and a Name. The values for the Variability field can be Mandatory, Optional, or Alternative according to the specified feature. The Priority of the feature should be High, Medium or Low. If the feature requires or excludes another feature(s), the Feat_id(s) from the required or excluded feature(s) must be specified. If the feature has a Parent Feature, the Feat_id from the parent feature must be specified. The Binding Time can be compile time or run time, according to the time that the feature will be included in a concrete product [11]. The Feature Type can be concrete or abstract, and the Description is a brief explanation of the feature. The Requirements Specification for Domain Engineering activity is usually performed in an iterative and incremental way. 
Therefore, sets of selected features from the Feature Model can be defined as units of increments for the specification (there may be different criteria for choosing features in a unit of increment, e.g., priority of implementation, cost, quality attributes). This activity uses the Feature Model, Feature Specification and Product Map as input artifacts and produces the Glossary, Functional Requirements and Traceability Matrix as output artifacts. Each of these artifacts is detailed below. 4) Product Map Each identified feature is assigned to the corresponding products in the SPL. The set of relationships among features and products produces the Product Map artifact, which describes all the required features for building a specific product in the SPL. It is usually represented as a matrix where columns represent the products and rows represent the features. 1) Glossary An important characteristic of software product line engineering is the presence of multiple stakeholders, domain experts, and developers. Thus, it is necessary to have a common vocabulary for describing the relevant concepts of the domain. The Glossary describes and explains the main terms in the domain, in order to share a common vocabulary among the stakeholders and to avoid misconceptions. It is represented as a two-column table containing the term to be defined and its description (see Table 2 in Section IV). All these artifacts are the input for the Requirements Specification for Domain Engineering activity, which is described below. 2) Functional Requirements This artifact contains all the functional requirements identified, common or variable, for the family of products that constitute the SPL. Use cases are used to specify the SPL functional requirements (each functional requirement is represented as a use case), and the required variations can be related to the use case as a whole or to alternative scenarios inside a use case. FeDRE adapts the template used in [13] for specifying functional requirements as use cases in order B. Requirements Specification for Domain Engineering This activity specifies the SPL requirements for domain engineering. These requirements allow realization of the features and desired products identified in the Scoping activity. The steps required to perform this activity are described in the Guidelines for Specifying SPL Requirements, Sub-Section C below. 20 • Actor: Represents an actor and can be related to other Actors or associated with UseCases; • Relationship: Represents the different types of relationships among UseCases, which are Include, Extend and Inheritance; • Scenario: Is an abstract metaclass used to represent the two types of scenarios for the UseCase, which are MainScenario and AlternativeScenario; • MainScenario: Represents the “normal flow” of steps for a UseCase; • AlternativeScenario: Represents an alternative set of steps for a UseCase. It can be associated with a Feature to represent the variability in the scenario; • Step: Represents a step in the MainScenario or AlternativeScenario. In order to structure the guidelines to specify functional requirements, we address the following questions: i) Which features or set of features will be grouped to be specified by use cases? (In our future work, we intend to group features according to QAs) ii) What are the specific use cases for the feature or set of features? iii) Where should the use cases be specified? 
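Reading the output artifacts together, one possible (assumed) data representation is sketched below: a use case carries its associated feature, scenario-level variability points to further features, and the Traceability Matrix is derived from those links. The UC013 "Retrieve Contacts" identifiers come from the case study in Section IV; the scenario names and conditions are illustrative guesses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

@dataclass
class AlternativeScenario:
    name: str
    condition: str
    associated_feature: Optional[str] = None   # scenario-level variability

@dataclass
class UseCase:
    use_case_id: str
    name: str
    associated_feature: str                    # use-case-level variability
    main_scenario: List[str] = field(default_factory=list)
    alternatives: List[AlternativeScenario] = field(default_factory=list)

uc013 = UseCase(
    "UC013", "Retrieve Contacts", "Import_Contact",
    main_scenario=["The logged user requests the retrieval of contacts",
                   "The system returns the contacts from the external system"],
    alternatives=[
        AlternativeScenario("Retrieve from Facebook", "a Facebook account is configured", "Facebook_Import"),
        AlternativeScenario("Retrieve from Twitter", "a Twitter account is configured", "Twitter_Import"),
    ],
)

def traceability_matrix(use_cases: List[UseCase]) -> Dict[str, Set[str]]:
    """Feature -> use cases that realize it, directly or through an alternative scenario."""
    matrix: Dict[str, Set[str]] = {}
    for uc in use_cases:
        matrix.setdefault(uc.associated_feature, set()).add(uc.use_case_id)
        for alt in uc.alternatives:
            if alt.associated_feature:
                matrix.setdefault(alt.associated_feature, set()).add(uc.use_case_id)
    return matrix

print(traceability_matrix([uc013]))
# {'Import_Contact': {'UC013'}, 'Facebook_Import': {'UC013'}, 'Twitter_Import': {'UC013'}}
```

Deriving the matrix from the use cases, rather than maintaining it by hand, is one way to keep features and requirements synchronized as either side evolves.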
(when having a set of features in a hierarchy, do we specify the use cases for each individual feature or only for the parent features?) iv) How is the use case specified in terms of steps? to support the two types of variability. The specification of functional requirements follows the functional requirements template shown in Table 3 from Section IV. Each functional requirement has a unique Use_case_id, a Name, a Description, Associated Feature(s), Pre and PostConditions, and the Main Success Scenario. Additionally, a functional requirement can be related to an Actor and may have Include and/or Extend relationships to other use case(s). Extends relationships should describe a condition for the extension. For the Main Success Scenario and the Alternative Scenarios, there are Steps (represented by numbers), Actor Actions (representing an action from the actor) and Blackbox System Responses (representing a response from the system). In addition, for the Alternative scenarios, there is a Name, a Condition and optionally relations to affected features through the Associated Feature field. 3) Traceability Matrix The Traceability Matrix is a matrix containing the links among features and the functional requirements. The rows in the matrix show the features and the columns show the functional requirements, as shown in Table 4 in Section IV. This matrix is also useful for helping in the evolution of the requirements since each change in the feature model will be traced up to the requirements through the traceability matrix (and vice versa). The guidelines consider four types of feature variability that may be present in the feature model, as shown in Table 1. The steps are the following: C. Guidelines for Specifying SPL Functional Requirements The purpose of the guidelines is to guide the Requirements Analyst in the specification of SPL functional requirements for domain engineering. The guidelines are based on a meta-model (see Figure 2) that represents the concepts involved when specifying use cases with alternative scenarios and the relationships among them. The meta-model is used to keep the traceability among all the elements and to facilitate understanding. The meta-model comprises the following elements: TABLE 1. FEATURES VARIABILITY Mandatory Feature Optional Feature Alternative Feature (OR) (one or more feature(s) can be selected) Alternative Feature (XOR) (only one feature can be selected) • RequirementsSpecification: Is the container of all the elements in the specification; • Feature: It represents a feature from a variability model. Although it is not defined in this model, it is related to zero or many requirements; • Requirement: It is an abstract metaclass used to represent functional requirements; • UseCase: Represents a functional requirement. A UseCase is associated with a Feature, other UseCases through the include, extend or inheritance relationships, or with Actors. It contains a Main Scenario and zero or many Alternative Scenarios; • UseCasePackage: Is the container of a UseCaseDiagram; • UseCaseDiagram: Is a view for Actors, UseCases and Relationships; 1. Identifying Use Cases. 1.1. Identify which Features will be grouped together to be specified by uses cases. The group of features must belong to the same hierarchy in the feature model; 1.1.1. Root mandatory Features or intermediate mandatory Features must have UseCases; 1.1.2. Mandatory leaf Features may have UseCases or can be specified as alternative scenarios from UseCases in the parent Feature; 21 Fig 2. 
Meta-Model for SPL Requirements 1.2. 1.3. 1.4. 1.5. Actors (with the Inheritance and Association Relationships). 1.1.3. Intermediate alternative Features (XOR) must have UseCases; 1.1.4. UseCases identified for leaf alternative Features (XOR) should be specified as alternative scenarios from UseCases in the parent Feature; 1.1.5. Root optional Features or intermediate optional Features must have UseCases; 1.1.6. UseCases identified for leaf optional Features should be specified as alternative scenarios from UseCases in the parent Feature; 1.1.7. Intermediate alternative Features (OR) must have UseCases; 1.1.8. UseCases identified for leaf alternative Features (OR) should be specified as alternative scenarios from UseCases in the parent Feature; Identify what the specific UseCases for the feature or group of features are; 1.2.1. UseCases identified for children Features (that have the same parent) with similar behavior must be specified just once in the parent Feature (where); 1.2.2. For each identified UseCase update the Traceability Matrix; Each Feature, which has one or more identified UseCase, should have a UseCasePackage; Each UseCasePackage should have a UseCaseDiagram; Each UseCaseDiagram should have the identified UseCases for the Feature, Relationships and 2. Specifying Use Cases. 2.1. All the UseCase specifications should follow the specifying use cases with alternative scenarios template (how), as shown in Table 3 in Section IV; 2.2. All mandatory fields from the template must be filled; 2.3. Each Extend must have a condition; 2.4. Create the UseCase MainScenario and UseCase AlterativeScenario. Each step, represented here by a number, comprises the Actor Action and a Black Box Response from the system; 2.5. Each AlternativeScenario should have a name and a condition; 2.6. An AlternativeScenario can be additionally associated with a Feature. IV. C ASE STUDY An exploratory case study to assess the usefulness of FeDRE was performed by following the guidelines presented in [24]. The stages of the case study development are: design, preparation, collection of data, and analysis of data, each of which is explained below. A. Design of the case study The case study was designed by considering the five components that are proposed in [24]: purpose of the study, underlying conceptual framework, research questions to be addressed, sampling strategy, and methods employed. The purpose of the case study is to show the usefulness of applying FeDRE to specify SPL requirements for Domain Engineering. The conceptual framework that links the phenomena to be studied is the feature-driven approach 22 where the SPL requirements are specified as use cases based on the features. The research questions that are intended to be addressed are: 1) How the SPL development team perceives the FeDRE approach in practice? – Is it easy to use and useful? 2) What limitations does the FeDRE approach present? The sampling strategy of the case study is based on an embedded single-case design. To answer these questions, FeDRE was applied to develop an SPL for the mobile software called SAVi (http://goo.gl/1Q49O), which is an application that notifies and allows a mobile contact list to track a user in an emergency situation, sending to the contact list a code by SMS and email. We chose SAVi in order to apply FeDRE in a real SPL project using an extractive / reactive SPL adoption. 
The SPL development team, which applied FeDRE approach, was composed of three people responsible for the scoping activity and six people responsible for the requirements specification activity. One of the authors, with more than 15 years of experience in requirements engineering, was responsible for validating the created artifacts. The technique that was used to obtain feedback about the usefulness of FeDRE was a survey answered by Ph.D. students from Universitat Politècnica de València and Federal University of Bahia after applying the FeDRE approach in practice. Fig 3. Selected Features from the Feature Model TABLE 2. EXCERPT FROM THE GLOSSARY *Definition Contact It represents a person to be contacted in an emergency situation. It includes relevant information like e-mail, phone number, Facebook ID, Twitter ID. Contact List Collection of contacts sorted in an alphabetical order. Twitter Micro blogging service. It is a site where the user can share small messages and retrieve contacts to SAVi. User B. Preparation of the case study With regards to the Scoping activity, all the artifacts (i.e., Feature Model, Feature Specification and Product Map) were created by one domain analyst and one domain expert, who were also assisted by a scoping expert with more than 6 years of experience in SPL scoping activities. The marketing analysis was done based on other products, with similar purpose to that of SAVi, available at the AppStore1. Functionalities of these products were included into the SAVi feature model, in which 27 features were identified. Since the FeDRE approach is flexible to support the incremental requirements specification, a set of features was selected to be the driver of the first iteration for the requirements specification. The selection of these features was done based on which features are present in most of the products in the Product Map and are easier to be implemented. Figure 3 shows an excerpt of the Feature Model and the selected features for the first iteration. It represents the person who uses the application. *Mandatory fields. With the artifacts created by the Scoping activity (Feature Model, Feature Specification and Product Map) and the Glossary artifact created by the Requirements Specification for Domain Engineering activity it was possible to create the Functional Requirements and the Traceability Matrix artifacts by applying the guidelines for specifying SPL requirements. C. Collection of the data The data for this case study was collected during the Requirements Specification for Domain Engineering activity. For specifying the SPL Functional Requirements, four additional requirements analysts were recruited. They were asked to apply the guidelines for specifying SPL functional requirements (shown in Sub-Section III-C) in order to answer the following questions: i) Which features can be grouped to be specified by Use Cases (UC)?; ii) What are the specific use cases for the feature or set of features?; iii) Where the use case should be specified?; and iv) How each use case is specified in terms of steps? Each one of the 27 features from the feature model was specified according to the feature specification template. Also, during the Scoping activity, a list of products for the mobile application for emergency notifications domain was defined allowing the creation of the Product Map artifact. 
Regarding the Requirements Specification for Domain Engineering activity, two requirements analysts from the team created the Glossary artifact based on the artifacts that were created in the Scoping activity. A total of 16 relevant terms were identified for the domain. An excerpt of this artifact is shown in Table 2. 1 *Term a) Which features can be grouped to be specified by UC? This step analyzes all the features included in the unit of increment for the current iteration, and for these features (see Figure 3) the analyst has to decide which ones will be specified by use cases. According to the guidelines, the requirements analysts decided that the features Contact (according to the step 1.1.1 from the guideline), Add_Contact (according to the step 1.1.2 from the guideline) and Import_Contact (according to the step 1.1.7 from the guideline) would be specified as use cases. Since there are three ways of implementing an import contact (two optional: Facebook_Import and Twitter_Import; and one Help.me: http://goo.gl/hSWpq / Rescue Button: http://goo.gl/asli3 / Red Panic Button: http://goo.gl/FpVsk / RescueMe Now: http://goo.gl/pDY9o 23 mandatory: Phone_Import) the requirements analysts decided that those features will not be specified as use cases (according to steps 1.1.2 and 1.1.8 from the guideline) and, instead, they will be specified as alternative scenarios in the use case related to the feature Import_Contact. representing the variability is the reuse. Since the alternative scenarios are specified just once within a use case, several products can be instantiated reusing the same use case. Thus, different behaviors can arise from the same use case, for different products, depending on the selected features. The traceability matrix (features X UC) is shown in Table 4. b) What are the specific UC for the feature or set of features? After deciding which features will be specified as use cases, the requirements analysts identified what will be the use cases for each feature. Moreover, the Traceability Matrix is incrementally filled in with traceability information between the use case and the feature. For the Contact feature, the following use cases were identified: Show Contact, Delete Contact and Update Contact. For the Add_Contact feature, the following use cases were identified: Define Message Language and Add Contact. The Import_Contact feature included the following use cases: Retrieve Contacts and Import Contacts. For the whole SPL, a total of 28 use cases were identified. TABLE 3. RETRIEVE CONTACTS USE CASE SPECIFICATION *Use_case_id: UC013 *Name: Retrieve Contacts *Description: It retrieves a contact list from an external system *Associated feature: Import_Contact Actor(s) [0..*]: The system should be The contacts from an *Preallowed to *Postexternal system are condition: communicate with the condition: retrieved to the external system system Includes To: Extends From: *Main Success Scenario Step 1 c) Where the UC should be specified? Actor Action The logged user requests his/her retrieve contacts Alternative Scenario name: Since some use cases with similar behavior may be identified for distinct features that have the same parent, the requirements analysts will decide where to relocate the specification for this use case (this is to avoid the redundant specification of similar behavior). When this happens, the use case specification is performed just once at the parent feature level. 
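As a side note to the traceability information described above (guideline step 1.2.2 and item b), the feature-to-use-case links could be kept in a structure as simple as the following sketch; the class and method names are hypothetical and do not come from the FeDRE tooling.

    import java.util.Collections;
    import java.util.LinkedHashMap;
    import java.util.LinkedHashSet;
    import java.util.Map;
    import java.util.Set;

    // Minimal sketch (assumed names) of a feature x use-case traceability matrix.
    class TraceabilityMatrix {
        private final Map<String, Set<String>> links = new LinkedHashMap<>();

        // Record a link whenever a use case is identified for a feature (guideline step 1.2.2).
        void trace(String feature, String useCaseId) {
            links.computeIfAbsent(feature, f -> new LinkedHashSet<>()).add(useCaseId);
        }

        // Forward trace: which use cases realize a given feature?
        Set<String> useCasesOf(String feature) {
            return links.getOrDefault(feature, Collections.emptySet());
        }

        // Backward trace: which features are realized by a given use case?
        Set<String> featuresOf(String useCaseId) {
            Set<String> result = new LinkedHashSet<>();
            links.forEach((feature, useCases) -> {
                if (useCases.contains(useCaseId)) {
                    result.add(feature);
                }
            });
            return result;
        }
    }

For example, the link between Import_Contact and the Retrieve Contacts use case reported in Tables 3 and 4 would be recorded as trace("Import_Contact", "UC013").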
As soon as all the use cases were identified for each feature, it is possible to start modeling the use cases. A use case package is created for each feature that will have use cases, and a use case diagram is created to include the use cases, actors and relationships among them. An example for the Import_Contact feature (use case package and diagram) is shown in Figure 4. For the whole SPL, a total of 13 packages and use case diagrams were specified. Retrieve via Facebook The logged user must be linked within an Facebook Condition: account Associated feature [0..1]: Facebook_Import Step Actor Action Blackbox System Response 1.1 [None] The System retrieves the Facebook contacts Alternative Scenario name: Retrieve via Twitter Condition: The logged user must be linked within an Twitter account Associated feature [0..1]: Twitter_Import Step Actor Action Blackbox System Response 1.1 [None] The System retrieves the Twitter contacts *Mandatory fields. TABLE 4. EXCERPT FROM THE TRACEABILITY MATRIX Twitter_Import Phone_Import Add_Contact (a) Blackbox System Response The System shows the retrieved contacts UC013 X X … D. Data Analysis For the data analysis, 6 requirements analysts, who are Ph.D. students with experience in requirements specification, from Universitat Politècnica de València and Federal University of Bahia, also applied the guidelines for specifying the functional requirements for the SAVi SPL individually. These analysts were different from the authors of the guidelines. After applying the guidelines, a meeting was organized to analyze the obtained results (functional requirements specification and traceability matrix) and the discrepancies were resolved with consensus. The purpose of the meeting was to identify possible problems with the guidelines. The data gathered helped us to fine-tune the definition of the guidelines. For instance, feature variability was made explicit in the guidelines facilitating the identification of which features need a use case specification and a new step was included in the guidelines (the update traceability matrix, step 1.2.2). (b) Fig 4. (a) UC Package and (b) UC Diagram (Feature Import_Contact) d) How each UC is specified in terms of steps? After identifying and relating the use cases to the features, the requirements analysts specified each one of the use cases taking into account the variations from the Feature Model. Table 3 shows the Functional Requirement specification for one of the use cases related to the feature Import_Contact, which is the Retrieve Contacts use case. This use case specification has the optional features Facebook_Import and Twitter_Import, which are specified as alternative scenarios. Thus, this use case specification handles the variability expressed in the Feature Model, where depending on the selected feature, an alternative scenario can be included in the use case specification. Another advantage of using alternative scenarios for After applying the guidelines, the 6 requirements analysts were asked to fill in a survey in order to answer the 24 two stated research questions (Sub-Section IV-A). This survey is available online (http://goo.gl/RRNzu0). To measure the perceived ease of use and perceived usefulness of the requirements analysts after applying the FeDRE approach (first research question), we relied on an existing measurement instrument proposed for evaluating requirements modeling methods based on user perceptions [1]. 
Specifically, we adapted two perception-based variables from that instrument, which were based on two constructs of the Technology Acceptance Model (TAM) [12]: the second research question. Both research questions are discussed as follows. 1) How the SPL development team perceives the FeDRE approach in practice? – Is it easy to use and useful? The first research question was addressed by defining the following hypotheses: TABLE 5. QUESTIONNAIRE FOR EVALUATING FEDRE Item Statement The FeDRE approach is simple and easy to follow Overall, the requirements specifications obtained by the FeDRE approach are easy to use PEOU3 It is easy for me to follow the guidelines proposed by FeDRE approach PEOU4 The guidelines for specifying SPL functional requirements are easy to learn I believe that FeDRE approach would reduce the time required to specify SPL requirements Overall, I found the FeDRE approach to be useful I believe that the SPL requirements specifications obtained with the FeDRE approach are organized, clear, concise and non-ambiguous PU1 PU2 PU3 PU4 I believe that the FeDRE approach has enough expressiveness to represent functional SPL requirements PU5 I believe that the FeDRE approach would improve my performance in specifying SPL functional requirements • H20: FeDRE is perceived as not useful, H21 = H20. To test the hypotheses, we verified whether the scores that the subjects assign to the constructs of the TAM are significantly better than the neutral score on the Likert scale for an item. The scores of a subject are averaged over the items that are relevant for a construct. We thus obtained two scores for each subject. The Kolmogorov–Smirnov test for normality was applied to the PEOU and PU data. As the distribution was normal, we used the One-tailed sample ttest to check for a difference in mean PEOU and PU for the FeDRE approach and the value 3. The results shown in Table 6 allow us to reject the null hypotheses, meaning that we corroborated empirically that the analysts perceived the FeDRE approach as being easy to use and useful. Table 5 shows the items defined to measure these perception-based variables. These items were combined in a survey, consisting of 9 statements. The items were formulated by using a 5-point Likert scale, using the opposing-statement question format. Various items within the same construct group were randomized, to prevent systemic response bias. PEOU and PU are measured by using four and five items in the survey, respectively. PEOU2 H10: FeDRE is perceived as difficult to use, H11 = H10; The hypotheses relate to a direct relationship between the use of the FeDRE approach and the users’ perceptions. Table 6 shows descriptive statistics for the perceived-based variables. The results of the survey showed that the subjects perceived FeDRE as easy to use and useful. This can be observed by the mean value for these variables (i.e., all of them are greater than the neutral score (i.e., the score 3)). • Perceived Ease of Use (PEOU): the degree to which a person believes that using FeDRE would be effort-free. This variable represents a perceptual judgment of the effort required to learn and use the FeDRE approach; • Perceived Usefulness (PU): the degree to which a person believes that FeDRE will achieve its intended objectives. This variable represents a perceptual judgment of the FeDRE approach effectiveness. Item PEOU1 • TABLE 6. 
DESCRIPTIVE STATISTICS AND 1-TAILED ONE SAMPLE T-TEST RANK FOR PERCEPTION-BASED VARIABLES PEOU PU x 3.71 4.14 σ 0.75 0.69 Mean Diff 0.71 1.14 t 2.500 4.382 p-value 0.024 0.0025 2) What limitations does the FeDRE approach present? Some limitations were identified after applying the FeDRE approach. Those limitations are related to approach scalability, productivity and effectiveness. Since the use of the approach in large projects is not yet clear, to overcome this scalability limitation we intend to apply FeDRE in new SPL domains with more features than the one used in this paper. To improve the approach productivity, a tool should be developed to support the specification and maintenance of the SPL requirements. Finally, to overcome the lack of effectiveness on the created artifacts, meetings among the requirements analysts should be performed in order to mitigate SPL requirements specification discrepancies. The use of multiple items to measure a same construct requires the examination of the reliability of the survey. This was done by using Cronbach’s alpha. The results of the reliability analysis are as follows: (PEOU = 0.764; PU = 0.760). Both constructs have an alpha value equal to, or greater than 0.7, which is a common reliability threshold [23]. As a result, we can conclude that the items in the survey are reliable. The survey also included three open questions in order to gather the opinions of the requirements analysts about the limitations of the approach and suggestions on how to make the FeDRE approach easier to use and more useful. These responses were used to answer V. CONCLUSIONS AND FURTHER WORK This paper introduced the FeDRE approach to support the requirements specification of SPLs. In this approach, chunks of features from a feature model are realized into functional requirements, which are then specified by use cases. The required requirements variations can be related to the use case as a whole or to alternative scenarios inside of a use case. A set of guidelines was provided to help SPL 25 developers to perform these activities and as a means to systematize the process and ensure a correct traceability between the different requirements artifacts. We believe that this approach provides a solution for dealing with the complexity of specifying SPLs with a large number of requirements and features. [6] Asadi M., Bagheri E., Gašević D., Hatala M., Mohabbati, B. 2011. Goal-driven software product line engineering. Proc. of the 2011 ACM Symposium on Applied Computing (NY, USA, 2011). 691-698. [7] Bayer, J., Flege, O., Knauber, P., Laqua, R., Muthig, D., Schmid, K., Widen, T., and DeBaud, J.-M. 1999. PuLSE: A Methodology to Develop Software Product Lines. In Proc. of the 1999 symposium on Software reusability (Los Angeles, CA, USA). ACM Press, 122-131. The feasibility of FeDRE was evaluated using a mobile application for emergency notifications case study. The results show that the analysts perceived the approach as easy to use and useful for specifying the functional requirements for this particular product line. However, the approach needs further empirical evaluation with larger and more complex SPLs. Such evaluation is part of our future work. We plan to apply FeDRE in the development of other SPLs at the domain and application engineering processes. 
In addition, we intend to extend the approach to cope with nonfunctional requirements, quality attributes in the feature model and explore the use of model-driven techniques to (partially) automate the guidelines to check the completeness and consistency of artifacts. [8] Bayer, J., Muthig, D., and Widen, T. 2000. Customizable domain analysis. In Proceedings of the First Int. Symp. on Generative and Component-Based Software Engineering, Springer 2001, 178–194. [9] Bosch, J Design and Use of Software Architectures: Adopting and Evolving a Product-Line Approach (Addison-Wesley, 2000). [10] Clements, P. and Northrop, L. 2007. Software Product Lines: Practices and Patterns, Addison-Wesley, Boston. [11] Czarnecki, K., Eisenecker, U.W. Generative Programming: Methods, Tools, and Applications. Addison-Wesley, 2000. [12] Davis F.D. Perceived Usefulness, Perceived ease of use and user acceptance of information technology, MIS Quarterly 13 3 1989 319–340. [13] Eriksson, M., Börstler, J., and Borg, K. 2005. The PLUSS approach domain modeling with features, use cases and use case realizations. In Proceedings on 9th International Conference on Software Product Lines (Rennes, France, September 26-29). Springer, 33–44. ACKNOWLEDGEMENTS This research work is funded by the Hispano-Brazilian Interuniversity Cooperation Program under ref. No.HBP-20110015 and the MULTIPLE project (TIN2009-13838) from the Spanish Ministry of Education and Science. This material is based upon work funded and supported by the DoD under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center. This material has been approved for public release and unlimited distribution (DM0000547). This work was partially supported by the National Institute of Science and Technology for Software Engineering (INES2), funded by CAPES, CNPq and FACEPE, grants 573964/2008-4 and APQ-1037-1.03/08 and CNPq grants 305968/2010-6, 559997/2010-8, 474766/2010-1 and FAPESB. The authors also appreciate the value-adding work of all their colleagues Loreno Alvim, Larissa Rocha, Ivonei Freitas, Tassio Vale and Iuri Santos who make great contributions to the Scoping activity of FeDRE approach. [14] Griss, M. L., Favaro, J., and d’ Alessandro, M. 1998. Integrating feature modeling with the RSEB. 5th ICSR, pp. 76–85, Canadá. [15] Heidenreich, F., Sánchez, P., Santos, J., Zschaler, S., Alférez, M., Araújo, J., Fuentes, L., Kulesza, U., Moreira, A. and Rashid, A. 2010. Relating feature models to other models of a software product line: a comparative study of featuremapper and VML. In Transactions on aspect-oriented software development VII. SpringerVerlag, Berlin, Heidelberg 69-114. [16] Kang, K. C., Cohen, S. G., Hess, J. A., Novak, W. E., and Peterson, A. S. 1990. Feature-Oriented Domain Analysis (FODA) Feasibility Study. Technical report. Software Engineering Institute. [17] Krueger C.W. Easing the transition to software mass customization. In Proc. of the 4th Int, Workshop on Software Product-Family Engineering (Bilbao, Spain, October 3–5, 2001). Springer, 282–293. [18] John, I., Eisenbarth, M. 2009. A decade of scoping: a survey. Proceeding on the 13th SPLC (San Francisco, USA,). ACM, 31-40. [19] Jones C., Applied Software Measurement: Assuring Productivity and Quality, McGraw-Hill: New York, 1996. REFERENCES [20] Moon, M., and Chae, H. S. 2005. 
An approach to developing domain requirements as a core asset based on commonality and variability analysis in a product line. IEEE Trans. Softw. Eng. 31(7), 551–569. Member-Keunhyuk Yeom. [1] Abrahao S., Insfran E., Carsí J.A., and Genero M. 2011. Evaluating requirements modeling methods based on user perceptions: A family of experiments. Information Science. 181, 16, 3356-3378. [2] Alexander I., and Maiden N. 2004. Scenarios, Stories, Use Cases: Through the Systems Development Life-Cycle. John Wiley. [21] Mussbacher, G., Araújo, J., Moreira, A., and Amyot, D. 2011. AoURN-based Modeling and Analysis of Software Product Lines. Software Quality Journal 20 (3-4), 645-687. [22] Muthig, D., I. John, M. Anastasopoulos, T. Forster, J. Dorr, and K. Schmid. GoPhone – A Software Product Line in the Mobile Phone Domain. IESE. Tech report: 025.04/E. 2004. [23] Nunally J., Psychometric Theory, 2nd ed., McGraw-Hill, NY, 1978. [3] Alférez, M., Kulesza, U., Weston, N., Araujo, J., Amaral, V., Moreira, A., Rashid, A., and Jaeger, M. C.A. 2008. Metamodel for Aspectual Requirements Modeling and Composition. Tech report. Univ. Nova de Lisboa, Portugal. [4] Alférez, M., Lopez-Herrejon, R. E., Moreira, A., Amaral, V. and Egyed, A. 2011. Supporting consistency checking between features and software product line use scenarios. In Proc. of the 12th Int. conference on Top productivity through software reuse (ICSR'11), Klaus Schmid (Ed.). Springer-Verlag, Berlin, Heidelberg, 20-35. [24] Runeson, P., Host, M. 2009. Guidelines for Conducting and [5] Anquetil, N., Kulesza, U., Mitschke, R., Moreira, A., Royer, J., requirements modeling language. In Proc. on 20th IEEE Int. Requirements Engineering Conference, Chicago, USA, pp. 151-160. Reporting Case Study Research. Software Engineering. Empirical Software Engineering 14(2). [25] Shaker, P., Atlee, J. M., Wang, S. 2012. A feature-oriented Rummler, A. and Sousa, A.. A model-driven traceability framework for software product lines. Softw. Syst. Model. 9, 4, 427-451, 2010. 2 [26] Soltani, S., Asadi, M., Gasevic, D., Hatala, M., Bagheri, E. 2012. Automated planning for feature model configuration based on functional and non-functional requirements. Proc. on the 16th SPLC. (Salvador, Brazil). ACM, Vol. 1, 56-65. INES - http://www.ines.org.br 26 Improving modular reasoning on preprocessor-based systems Jean Melo Paulo Borba Informatics Center Federal University of Pernambuco Recife, Brazil Email: [email protected] Informatics Center Federal University of Pernambuco Recife, Brazil Email: [email protected] Abstract—Preprocessors are often used to implement the variability of a Software Product Line (SPL). Despite their widespread use, they have several drawbacks like code pollution, no separation of concerns, and error-prone. Virtual Separation of Concerns (VSoC) has been used to address some of these preprocessor problems by allowing developers to hide feature code not relevant to the current maintenance task. However, different features eventually share the same variables and methods, so VSoC does not modularize features, since developers do not know anything about hidden features. Thus, the maintenance of one feature might break another. Emergent Interfaces (EI) capture dependencies between a feature maintenance point and parts of other feature implementation, but they do not provide an overall feature interface considering all parts in an integrated way. Thus, we still have the feature modularization problem. 
To address that, we propose Emergent Feature Interfaces (EFI) that complement EI by treating feature as a module in order to improve modular reasoning on preprocessor-based systems. EFI capture dependencies among entire features, with the potential of improving productivity. Our proposal, implemented in an open source tool called Emergo, is evaluated with preprocessor-based systems. The results of our study suggest the feasibility and usefulness of the proposed approach. I. I NTRODUCTION A Software Product Line (SPL) is a set of softwareintensive systems that share a common, managed set of features satisfying the specific needs of a particular market segment or mission and that are developed from reusable assets [1]. The idea of SPL is the systematic and efficient creation of products based on strategic software reuse. By reusing assets, we can build products through features defined in accordance with customers’ requirements [2]. In this context, features are the semantic units by which we can distinguish product line variants [3]. Feature models define the legal combinations of features by representing commonalities and variabilities of a product line [4]. In order to implement features, developers often use preprocessors [5], [6], [7]. Conditional compilation directives like #ifdef and #endif encompass code associated with features. Although preprocessor is widespread used for building SPL, its usage can lead to obfuscated source code reducing comprehensibility and increasing maintenance costs (i.e. code pollution), as a result, becoming error-prone [8], [9]. Besides that, preprocessors do not provide support for separation of concerns. In the literature, #ifdef directives are even referred to as “ifdef hell” [10], [11]. 27 For this reason, Virtual Separation of Concerns (VSoC) [5] reduces some of the preprocessor drawbacks by allowing developers to hide feature code not relevant to the current maintenance task. VSoC provides to developer a way of focusing on one feature, which is important for his task at the moment [12]. In other words, VSoC is helpful to visualize a feature individually. However, this approach is not enough to provide feature modularization since a developer does not know anything about hidden features [13]. As a result, the developer might introduce some bugs in the system when he changes a variable or method of a determined feature. Features eventually share variables and methods. Several descriptions of feature interaction phenomena are given in the literature (e.g. in [14]). We refer to feature dependency whenever we have such sharing like when a feature assigns a value to a variable which is subsequently used by another feature. Thus, one change in one feature can lead to errors in others. Moreover, these errors can cause behavioral problems in the system [15]. In many cases, bugs are only detected in the field by customers postrelease when running a specific product with the broken feature [12]. In order to minimize these problems, researchers have proposed the concept of Emergent Interfaces (EI) [16] to capture the dependencies between a feature part that the programmer is maintaining and the others. This approach is called emergent because the interfaces emerge on demand and give information to developer about other feature pieces that can be impacted. EI still have the VSoC benefits. Yet, they do not provide an overall feature interface considering all parts in an integrated way. 
In other words, EI have just captured dependencies among parts of a feature (not the feature as a whole) because they only know about the existing dependencies to one determined code encompassed with #ifdef. Likely, a feature is scattered across the source code and tangled with code of other features (through preprocessor directives) [17]. This way, each #ifdef represents one piece of the feature. Thus, there is no a complete understanding of a given feature. As a consequence, the programmer still might introduce errors in the system since he may not be aware of all dependencies of a given feature that he is maintaining. To sum up, EI only capture dependencies of one feature piece at a time. To address this problem, we propose Emergent Feature Interfaces (EFI) that complement EI because instead of knowing about feature parts, EFI see a feature as a “component” which has provided and required interfaces in order to improve mod- ular reasoning on preprocessor-based systems by looking for feature dependencies. Our proposal, implemented in an open source tool called Emergo,1 is evaluated with preprocessorbased systems. The results of our study bring preliminary evidence about the feasibility and usefulness of the proposed concept. The rest of the paper is structured as follows. Section II presents some motivating examples of real scenarios. Then, in Section III, we describe our proposal as well as the architectural design. After that, we discuss the case study in Section IV. Section V discusses about related work. Finally, in Section VI, we draw some conclusions and give directions for future work. II. M OTIVATING E XAMPLE Emergent Interfaces reduce Virtual Separation of Concerns problems by capturing data dependencies between a feature maintenance point and parts of other feature implementation of a software product line [16]. Using this approach, the developer can maintain a feature part being aware of the possible impacts in others [15]. Nevertheless, we show that EI are not enough to provide feature modularization, which aims at achieving independent feature comprehensibility, changeability, and development [13]. In this context, we present two maintenance scenarios in order to illustrate the problems mentioned in Section I and addressed in this paper. First, consider a JCalc2 -based product line of a standard and a scientific calculator written in Java. The code of the product line is available here.3 We transform the JCalc project in a SPL in order to utilize it only as running example that has mandatory, optional and alternative features, that is why we do not consider this product line in our case study. The JCalc class contains the main method responsible for executing the calculator (see Figure 1). As we can see, this method has three features: PT BR, GUI, and LOOKANDFEEL. Notice that the features are tangled along the main method. Also, the GUI feature is scattered across the method twice. Now, suppose a developer needs to maintain the GUI feature. First of all, he should look at the current feature to understand the role of the feature in its entirety, achieving independent feature comprehensibility. To do so, the developer must get the EI for each code encompassed with #ifdef GUI. The interfaces related to the first and second #ifdef GUI statements are shown in Figure 2 respectively. 
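Figures 1 and 2 are not reproduced in this transcription. Purely to illustrate the kind of tangling and scattering being described, a JCalc-like main method might look roughly as follows; the feature names (PT_BR, GUI, LOOKANDFEEL) and the shared title and frame variables are taken from the paper, while the concrete statements and the Antenna-style comment directives are assumptions, not the actual JCalc code.

    import javax.swing.JFrame;
    import javax.swing.SwingUtilities;

    // Illustrative only: not the actual JCalc code (Antenna-style //#ifdef directives assumed).
    public class JCalc {
        public static void main(String[] args) {
            String title = "Calculator";
            //#ifdef PT_BR
            title = "Calculadora";                        // PT_BR redefines the title later used by GUI
            //#endif

            //#ifdef GUI
            JFrame frame = new JFrame(title);             // GUI requires title and provides frame
            //#ifdef LOOKANDFEEL
            SwingUtilities.updateComponentTreeUI(frame);  // LOOKANDFEEL uses the frame provided by GUI
            //#endif
            //#endif

            //#ifdef GUI
            frame.setSize(320, 240);                      // GUI is scattered: a second #ifdef GUI block
            frame.setVisible(true);
            //#endif
        }
    }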
The first emerged information alerts the developer that he should also analyze the hidden LOOKANDFEEL feature whereas VSoC hides feature code not relevant to the current maintenance task, in this case, maintenance in the GUI feature. So, using EI the developer is aware of the dependencies that include the hidden features. It is important to stress that the GUI feature is scattered in other classes as well. As we explain now, part of the information presented by the interface might be relevant to a developer, whereas other part might not be. For instance, the second emergent interface 1 https://github.com/jccmelo/emergo Fig. 1. Maintenance in the GUI feature Fig. 2. EI for #ifdefs GUI is irrelevant because these variables reported are not used in other features across the entire system. In addition, the first interface contains both important and not important information for the developer, since he wants to understand whether the code elements of the GUI feature impact other features. So, the second information (‘Provides frame to (GUI)’) from first interface is unnecessary since the developer only does not know anything about the feature LOOKANDFEEL. This means that the developer is aware of that the GUI feature uses the frame variable because GUI is being maintained by him. Thus, we have the polluted emergent interfaces problem. Besides this problem of polluted interfaces, EI have another limitation due to the amount of preprocessor directives per feature. We illustrate that with a second scenario extracted 2 http://jcalculator.sourceforge.net/ 3 http://twiki.cin.ufpe.br/twiki/bin/viewfile/SPG/Emergo?rev=1;filename= JCalc.zip 28 from the Vim editor.4 Vim is a highly configurable text editor built to enable efficient text editing. Figure 3 depicts the source code for syntax highlighting of the Vim editor. The highlight changed function translates the ‘highlight’ option into attributes and sets up the user highlights. According to Figure 3 the highlight changed function has too many preprocessor directives (#ifdefs) which represent USER HIGHLIGHT and FEAT STL OPT features. mation for maintaining the feature under his responsibility. For instance, in our example, the developer might overlook the information that the FEAT STL OPT feature provides hlt pointer to another feature, as shown in Figure 3. As a consequence, he might introduce bugs in some SPL variant by leading to late error detection [12], since we can only detect errors when we eventually happen to build and execute a product with the problematic feature combination. This means that the overall maintenance effort increases. All things considered, we have the second problem which is the lack of an overall feature interface for the whole feature, having only access to partial feature interfaces for each block of feature code. In addition to EI of feature parts, there is an information overload since the interfaces are computed one by one and, then, the developer might join them. This joining process might be expensive and further some of these interfaces might have duplicate information. In face of that, the developer is susceptible to consider code unnecessarily, wasting time. The following section describes how we address these problems. III. 
E MERGENT F EATURE I NTERFACES To solve the aforementioned problems, we propose the idea of Emergent Feature Interfaces (EFI), an evolution of the Emergent Interfaces approach, which consists of inferring contracts among features and capturing the feature dependencies by looking at the feature as a whole. These contracts represent the provided and required interfaces of each feature. With these interfaces per feature, we can detect the feature dependencies (using sensitive-feature data-flow analysis). We use a broader term for contracts than “Design by contract” proposed by Meyer [19] since we infer the contracts by analyzing the code. We establish contracts between the feature being maintained and the remaining ones through the interfaces. The concept of interfaces allows us to know what a given feature provides and requires from others. Considering the first interface of Figure 2, EI do not inform us about the required interfaces, but the GUI feature requires title variable from the PT BR feature. Therefore, we improve EI by adding required interfaces for computing the emergent feature interfaces. Fig. 3. The highlight changed function from the Vim editor In this context, suppose a developer should study the FEAT STL OPT feature in order to implement a user requirement that consists to change this feature. Thus, he has to make use of the emergent interfaces generated for each code block encompassed with #ifdef FEAT_STL_OPT. After that, he has to memorize or join all interfaces gotten previously. In this function, he would have to observe six interfaces (vide Figure 3). This becomes worse as the scattering increases. An analysis done in forty preprocessor-based systems claims that a signicant number of feature incur a high scattering degree and the respective implementation scatters possibly across the entire system [18]. This process is time consuming, leading to lower productivity. Because of the potentially hard work to get all emergent interfaces, the developer might forget some relevant infor4 http://www.vim.org/ 29 In addition to establishing contracts, EFI obtain all dependencies between the feature that we are maintaining and the remaining ones. These dependencies occur when a feature shares code elements, such as variables and methods, with others. In general, this occurs when a feature declares a variable which is used by another feature. For example, in our first motivating example (vide Figure 1), the title variable is initialized in the PT BR feature (alternative feature) and, subsequently, used in GUI (mandatory feature). Thus, we can have feature dependencies like alternative/mandatory, mandatory/optional, and so on. After capturing the feature dependencies, we use featuresensitive data-flow analysis [20]. In doing so, we keep dataflow information for each possible feature combination. To clarify the understanding, we present how emergent feature interfaces work. Consider the example with regard to JCalc product line of Section II, where a programmer is supposed to maintain the GUI feature. As our proposal derives from EI, the programmer still has to select the maintenance point but with a slight difference since he can also ask about a determined feature. In other words, the developer can select both a code block as a feature declaration (i.e. #ifdef FeatureName). The developer is responsible by the selection (see the dashed rectangle in Figure 4) which in this case is the #ifdef GUI. 
Then, we perform code analysis based on data-flow analysis to capture the dependencies between the selected feature and the other ones. Finally, the feature interface emerges. Fig. 4. of seeing it part by part. In fact, it is easier to the developers understand the dependencies among features through a macro vision than get all interfaces one by one and, then, join them. For instance, in our second scenario (see Section II) there are several #ifdefs and, in special, the FEAT STL OPT feature is scattered across the highlight changed function (see Figure 3). Instead the developer having to repeat six times the #ifdef FEAT_STL_OPT selection for obtaining all interfaces we provide an integrated way to avoid this information overload. This way, the developer only need to select #ifdef FEAT_STL_OPT one time, then the data-flow analysis is performed and, finally, the feature interface emerged (as shown in Figure 5). Note that no dependencies found between FEAT STL OPT and the remaining ones. Again, reading this information the developer already knows that the FEAT STL OPT feature neither provides any variable or method nor requires to other ones. Emergent feature interface for the GUI feature The EFI in Figure 4 states that maintenance may impact products containing the LOOKANDFEEL feature. This means we provide the actual frame value to the LOOKANDFEEL feature and require the title value offered by the PT BR feature or its alternative one. Reading this information, the developer is now aware of the existing contracts between GUI and LOOKANDFEEL features and also between GUI and PT BR. The emerged information has been simplified because the developers only need to know the dependencies inter-feature, not intra-feature. That is, EFI only show dependencies between the feature he is maintaining and the remaining ones. We can check this in Figure 1 where the frame variable is used in other places but the emergent feature interface (see Figure 4) just exhibit frame dependency concerning the feature LOOKANDFEEL. The Figures 2 and 4 show the difference between EI and EFI clearly. Thus, the polluted emergent interfaces problem is addressed by cutting irrelevant information to programmer. This way, he focuses on information that should be analyzed avoiding considering code unnecessarily, wasting time. It is important to stress that EFI compute both direct and indirect dependencies among features. In other words, if a variable A is set in FEATURE 1 and used in FEATURE 2 to assign its value in other variable B (B = A). Finally, FEATURE 3 uses B. In this case, EFI alert to programmer that FEATURE 1 has a direct dependency to FEATURE 2 (through A) as well as FEATURE 2 to FEATURE 3 (through B). Besides that, EFI provide the indirect dependency between FEATURE 1 and FEATURE 3 since FEATURE 3 depends on FEATURE 1 indirectly (transitivity property). Note that the code might have many other #ifdefs making the interface’s construction more complex. According to the results presented by the authors of the concept of EI, the most methods have several preprocessor directives [15]. This means that it is better look at the feature as a whole instead 30 Fig. 5. Emergent feature interface for the FEAT STL OPT feature Therefore, our idea complements EI in the sense that we evolve this approach taking into account the feature as a module. 
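Before moving on, the direct and indirect (transitive) feature dependencies discussed above can be pictured with a small sketch; the class and method names are assumptions, not Emergo's implementation.

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.LinkedHashSet;
    import java.util.Map;
    import java.util.Set;

    // Sketch (assumed names): derive indirect feature dependencies from direct ones by transitivity.
    class FeatureDependencyGraph {
        private final Map<String, Set<String>> direct = new HashMap<>();

        // Add an edge FEATURE_1 -> FEATURE_2 when a variable set in FEATURE_1 is used in FEATURE_2.
        void addDirect(String provider, String consumer) {
            direct.computeIfAbsent(provider, f -> new HashSet<>()).add(consumer);
        }

        // All features reachable from 'feature': its direct plus indirect dependencies.
        Set<String> allDependenciesOf(String feature) {
            Set<String> visited = new LinkedHashSet<>();
            collect(feature, visited);
            visited.remove(feature);
            return visited;
        }

        private void collect(String feature, Set<String> visited) {
            if (!visited.add(feature)) {
                return;                                            // already explored
            }
            for (String next : direct.getOrDefault(feature, Collections.emptySet())) {
                collect(next, visited);
            }
        }
    }

    // Example from the text: A is set in FEATURE_1 and used in FEATURE_2; B is set in FEATURE_2
    // and used in FEATURE_3. Then allDependenciesOf("FEATURE_1") also reports FEATURE_3 (indirect).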
As a result, the developer sees the feature dependencies through a interface more clean and precise that aids him to understand the impact of a possible change in the source code during preprocessor-based system maintenance. In that sense, interfaces enable modular reasoning [17]: We can understand a feature in isolation without looking at other features. Thus, EFI help to solve the feature modularization problem, since the programmer achieves independent feature comprehensibility and, consequently, he can change a feature code aware of its dependencies, avoiding breaking the contracts among features [13]. This improvement is feasible and useful to improve modular reasoning on preprocessor-based systems, with the potential of improving productivity. We present how EFI work in terms of implementation in the next section. A. Implementation To implement our approach, we built a tool called Emergo, an Eclipse plugin. It is available at: https://github.com/jccmelo/ emergo. The Figure 6 depicts both the Emergo’s architecture and the data-flow from developer’s selection up to the interface to appear. The architecture follows a heterogeneous architectural style based on the layered (GUI, Core, and Analysis) and independent components. To show the process for getting EFI, we explain step by step the activity diagram-like (as seen in Figure 6). First of all, the developer selects the maintenance point which indicates what code block or feature he is interested at the moment. Then, the GenerateEIHandler component sets up the classpath from the accessible information at the project. Besides that, it gets the compilation unit in order to know whether is a Java or Groovy project, treating each type of project in accordance with its peculiarities.5 Meanwhile, we associate the maintenance point selection with a feature expression and compute the Abstract Syntax Tree (AST) from Eclipse infrastructure. After, we mark each tree node that represents the selected code by the developer. This marking of the AST nodes from text selection is important to bind AST node on Soot’s Unit object later. Incidentally, Soot [21] is a framework for analysis and optimization of Java code. It accepts Java bytecode as input, converts it to one of the intermediate representations, applies analyses and transformations, and converts the results back to bytecode. Since we consider SPLs, we use the Soot framework to execute feature-sensitive data-flow analysis [20] and then capture feature dependencies. Before we apply data-flow analysis, the Soot component gets the classpath provided by the GenerateEIHandler and configures it by putting all bytecode in a specific place. Then, Soot loads the class that the developer is maintaining. This means, Soot gets the class bytecode and converts it into main intermediate representation of the Soot called Jimple, typed 3-address representation of Java bytecode. Jimple is suitable for decompilation and code inspection. In addition to load the class, we use the bridge design pattern to deal the difference between Java and Groovy independently. This way, we can bind AST nodes on Soot Units which correspond to statements. After this step, we have a mapping between AST nodes and statements and, hence, we are able to get the units in selection. This mapping is passed to Instrumentor that iterates over all units, look up for their feature expressions and add a new Soot Tag to each of them, and also computes all the feature expressions found in the whole body. 
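To make this tagging step more concrete, the following is a deliberately Soot-independent sketch under assumed names; it mimics only the shape of what was just described (attaching the guarding feature expression to each unit and collecting all expressions in the body), not Emergo's actual classes or the real Soot API.

    import java.util.LinkedHashSet;
    import java.util.List;
    import java.util.Set;

    // Sketch (assumed names): tag each unit of a method body with the feature expression
    // that guards it, and collect all feature expressions found in the body.
    class FeatureTag {
        static final FeatureTag EMPTY = new FeatureTag("");   // for units outside any #ifdef
        final String expression;                              // e.g. "GUI && LOOKANDFEEL"
        FeatureTag(String expression) { this.expression = expression; }
    }

    class Unit {
        final String statement;
        final String guardingExpression;                      // from the enclosing #ifdef, "" if none
        FeatureTag tag;
        Unit(String statement, String guardingExpression) {
            this.statement = statement;
            this.guardingExpression = guardingExpression;
        }
    }

    class Instrumentor {
        Set<String> instrument(List<Unit> body) {
            Set<String> allExpressions = new LinkedHashSet<>();
            for (Unit unit : body) {
                if (unit.guardingExpression.isEmpty()) {
                    unit.tag = FeatureTag.EMPTY;              // unannotated code gets an empty tag
                } else {
                    unit.tag = new FeatureTag(unit.guardingExpression);
                    allExpressions.add(unit.guardingExpression);
                }
            }
            return allExpressions;                            // all feature expressions in the body
        }
    }

A feature-sensitive ("lifted") reaching-definitions analysis can then keep one set of data-flow facts per combination of these expressions, which is the role of the LiftedReachingDefinitions component described next.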
Units with no feature expression receive an empty Soot Tag. The idea is to tag information onto relevant bits of code in order that we can then use these tags to perform some optimization in the dependency graph at the end. After the bytecode instrumentation, we build the Control Flow Graph (CFG) and, then, run reaching definitions analysis through the LiftedReachingDefinitions component that uses the Soot data-flow framework. The Soot data-flow framework is designed to handle any form of CFG implementing the interface soot.toolkits.graph.DirectedGraph. It is important to stress that our reaching definitions analyses are feature-sensitive data-flow analysis [20]. This way, we keep data-flow information for each possible feature combination. Then, the DependencyGraphBuilder component accepts mapping between AST nodes to statements, units in selection, CFG, and all possible feature combinations as input, iterates 5 http://groovy.codehaus.org/Differences+from+Java 31 over the CFG for creating the nodes from units in selection which can represent use or definition. If the node is a definition we get all uses and for each use found we create one directed edge on the dependency graph which represents the EFI. Otherwise, we just get its definition and connect them. Recalling that both paths support transitivity property. After the dependency graph is populated, we prune it to avoid duplicate edges and having more than one edge between two given nodes. Finally, the EmergoGraphView component shows the dependency graph in a visual way where the developer becomes aware of the feature dependencies, with the potential of improving productivity. Besides this graph view, we also provide a table view. These information alert the developer about what interfaces might be broken if he changes the code in specific places. B. Limitations and Ongoing work Our tool currently implements the general algorithm to emerge feature interfaces. The main limitation when computing interfaces happens when we have mutually exclusive features. Although the data-flow analysis is feature-sensitive, the Emergo still searches uses of a determined definition in all feature expressions, whether alternative or not. Improving this computation is an ongoing work. Also, we are working on interprocedural analysis for capturing dependencies among classes, packages, and components since a feature can be scattered in different places. IV. C ASE S TUDY To assess the effectiveness and feasibility of our approach, we conducted a case study following guidelines from Runeson and Host [22]. Our evaluation addresses these research questions: • RQ1: Is there any difference between Emergent Interfaces and Emergent Feature Interfaces? • RQ2: How do Emergent Feature Interfaces’ dependency detection compare to Emergent Interfaces? Our study includes five preprocessor-based systems in total. All of these software product lines are written in Java and contain their features implemented using conditional compilation directives. These systems contain several features. For instance, the lampiro product line has 11 features and, the mobile media has 14 features. Among these systems, Best lap and Juggling product lines are commercial products. We reuse the data produced by other research [15], whose authors proposed the EI concept. Table I shows the product lines, including their characteristics such as the amount of preprocessor directives utilized in the entire product line, among others. 
We count preprocessor directives like #ifdef, #ifndef, #else, #if, and so on. MDi stands for the number of methods with preprocessor directives; for example, Mobile-rss has 244 methods with preprocessor directives (27.05%) out of the 902 methods in the entire product line. MDe stands for the number of methods with feature dependencies. In particular, we use MDe to select the methods with feature dependencies to answer our research questions.
Fig. 6. Emergo's architecture and activity diagram-like
TABLE I. CHARACTERISTICS OF THE EXPERIMENTAL OBJECTS.
System       Version   MDi      MDe      # methods   # cpp directives
Best lap     1.0       20.7%    11.95%   343         291
Juggling     1.0       16.71%   11.14%   413         247
Lampiro      10.4.1    2.6%     0.33%    1538        61
MobileMedia  0.9       7.97%    5.8%     276         82
Mobile-rss   1.11.1    27.05%   23.84%   902         819
According to the data presented in Table I, these metrics vary across the product lines. Given a maintenance point in one of these product lines, we evaluate the difference between EI and EFI. Our aim is to understand to what extent the latter complement the former. To answer the research questions, we randomly selected a subset of methods with feature dependencies [15] and then compared the results produced by EFI to the results generated by EI. Note that the same set of selected methods is used to conduct the comparisons between EFI and EI. From these five experimental objects, we have 446 methods with preprocessor directives. We randomly picked ten methods that contain feature dependencies, deciding first to take two methods per product line. We also randomly selected the maintenance points. In doing so, we identified valid maintenance points, dismissing comments, whitespace and method/class declarations, since our data-flow analysis is intraprocedural. Then, we computed EI and EFI for each chosen maintenance point. To select these methods and maintenance points, we used RANDOM.ORG (http://www.random.org/), which offers true random numbers to anyone on the Internet.
After discussing the study settings, we present the results of our evaluation for each method with feature dependencies, as shown in Table II. For each selected method, the table shows the EI and EFI produced from the maintenance points. It is important to note that, depending on the maintenance point selected, a method might have no dependency among features. Although these ten methods contain feature dependencies, there are two cases in which no dependencies were found, that is, the selected variables are neither used nor defined in another feature. As can be seen, EI return 'No dependencies found!' in all the cases in which the maintenance point is not an assignment. This suggests that our claim that EI miss required interfaces holds. In contrast, EFI take both provided and required interfaces into consideration. Moreover, whenever a method has no feature dependency, the two approaches give the same interface. For instance, there is no difference between EI and EFI for the Resource.save method (see Table II). Although this method has no dependencies, EFI look for the definition of the playerID variable across the method in order to alert the developer to the backward contract (required interface) between the feature he is maintaining and the other one. In this case, EFI returned 'No dependencies found!' because the playerID variable is defined in the same feature that contains its use. Otherwise, EFI would return 'Requires playerID variable from feature X', where X represents the feature name.
This latter kind of case happened, for example, at the constructor of the class ResourceManager where EI did not find any dependencies whereas EFI found. Besides that, EI do not provide support for feature selection as a maintenance point. This is bad since the developer might want to understand a feature as a whole before applying any change the code. For example, consider the Best lap product line’ MainScreen.paintRankingScreen method, if the developer wanted to know what feature dependencies exist between the device screen 128x128 feature and the remaining ones, he should select all statements (one-by-one) within that feature. This is a potential hard work depending on the amount of statements of the feature. In this context, our approach is useful and feasible since EFI provide macro information per feature, improving modular reasoning on preprocessor-based systems (see the first line of the Table II). Another important aspect is the simplified view that EI do not offer to developers. For instance, the PhotoViewController.handleCommand method has the imgByte declaration encompasses with #ifdef sms || capturePhoto. This variable is used in different places (sms || capturePhoto and copyPhoto). EI show both use places in their message whereas EFI only alert the developer about dependencies outside the current feature configuration. This way, the developer just needs to worry with the copyPhoto feature since he is aware of the feature that he is maintaining. Thus, we believe that a simplified and global view helps the developer, with the potential of improving productivity. In summary, we believe that when the number of feature dependencies increases, our approach is better than EI because the probability of finding at least one required interface increases as well. In addition, when the number of feature dependencies increases EI might have too much information whereas EFI present a simplified view to the developer. At last, whenever the developer wants to see the feature dependencies of a specific feature EFI is the best option. Thus, the answer of the first research question is yes in cases where the maintenance point is not a assignment, including a particular occasion when the developer selects a feature such as #ifdef device_screen_128x128. The second question has already been responded along the previous paragraphs. A. Threats to validity To minimize selection bias, we randomly choose ten methods and the maintenance points. Yet, we get a subset of the product lines presented by Ribeiro et. al [15] in order to test our tool. For this, all five product lines selected are written in Java. Another threat is that we do not have access to the feature model of all SPLs, so the results can change due to feature model constraints, but we test both approaches (EI and EFI) of equal manner. In addition, we manually compute EI and EFI, as shown in Table II. This can contain some error, but we are very familiar with these approaches and we still revise each generated interface twice. We acknowledge the fact that more studies are required to draw more general conclusions. However, the results bring preliminary evidence about the feasibility and usefulness of the proposed approach. 
Although we cannot respond questions like “Is maintenance easier using our approach than using EI?” and, 33 “How feature dependencies impact on maintenance effort when using EI and EFI?” precisely, we believe that our study is an approximation to answer these questions because we provide more abstract and accurate interfaces than EI. V. R ELATED W ORK Many works investigate incorrect maintenance [23], [24], [25]. Sliwerski et al. [25] proposed a way to find evidence of bugs using repositories mining. They found that developers usually perform incorrect changes on Friday. Anvik et al. [26] applied machine learning in order to find ideal programmers to correct each defect. On the other hand, Gu et al. [23] studied the rate of incorrect maintenance in Apache projects. The proposed work helps in the sense of preventing errors during SPL maintenance, since the interface would show the dependencies between the feature we are maintaining and the remaining ones. Some researchers [27] studied 30 million code lines (written in C) of systems that use preprocessor directives. They found that directives as #ifdefs are important to write the code. But, the developers can easily introduce errors in the program. For example, open a parenthesis without closing it or even write a variable and then use it globally. The examples we focus on this paper show modularity problems that can arise when maintaining features in preprocessor-based systems. We analyze some software systems implemented with preprocessors and, then, we use these acquired knowledge to propose the Emergent Feature Interfaces concept. Also, we implement an Eclipse plug-in, Emergo, for helping the developers, avoiding breaking feature contracts. Emergent Interfaces [16] allow us to capture the dependencies among code snippets of distinct feature parts. This approach is called emergent because the interfaces emerge on demand and give information to developer about other feature pieces which can be impacted. However, this approach still leave of capture some feature dependencies [15]. Moreover, it has just captured dependencies among parts of a feature (not treating the feature as a whole). Our proposal complements this one by capturing dependencies among entire features by providing an overall feature interface considering all parts in an integrated way. Thus, EFI improve modular reasoning on preprocessor-based systems, with the potential of improving productivity. Recently some researchers [28] proposed analysis of exception flows in the context of SPL. For instance, a variable feature signaled an exception a different and unrelated variable feature handled it. When exception signalers and handlers are added to an SPL in an unplanned way, one of the possible consequences is the generation of faulty products. Our approach has a different manner for improving the maintainability of SPLs. We detect the feature dependencies by executing feature-sensitive data-flow analysis in order to improve modular reasoning on SPLs when evolving them. We do not consider implicit feature relation that comes about in the exceptional control flow since we just focus on dependencies among annotated features (with preprocessor directives). Finally, using the feature model, it is known that not all feature combinations produce a correct product. Depending on the problem domain, selecting a feature may require TABLE II. E VALUATION RESULTS . 
TABLE II. EVALUATION RESULTS.
System | Method | Maintenance Point | EI | EFI
Best lap | MainScreen.paintRankingScreen | #ifdef device_screen_128x128 | Do not provide support for this selection! | Provides rectBackgroundPosX, rectBackgroundPosY, positionPosX, loginScorePosX, etc. values to root feature.
Best lap | Resources.save | dos.writeUTF(playerID); | No dependencies found! | No dependencies found!
Juggling | TiledLayer.paint | firstColumn = (clipX - this.x)/this.cellWidth; | Provides firstColumn value to game_tiledlayer_optimize_backbuffer feature. | Provides firstColumn value to game_tiledlayer_optimize_backbuffer feature.
Juggling | Resources.load | playerLogin = dis.readUTF(); | No dependencies found! | Requires dis variable from root feature.
Lampiro | ChatScreen.paintEntries | int h = g.getClipHeight(); | Provides h value to root feature. | Provides h value to root feature and requires g variable from root feature.
Lampiro | ResourceManager.ResourceManager | while ((b = is.read()) != -1) {...} | No dependencies found! | Requires is variable from GLIDER feature.
MobileMedia | PhotoViewController.handleCommand | byte[] imgByte = this.getCapturedMedia(); | Provides imgByte value to configurations: [sms || capturePhoto] and [copyPhoto && (sms || capturePhoto)]. | Provides imgByte value to copyPhoto feature.
MobileMedia | SelectMediaController.handleCommand | List down = Display.getDisplay(...); | Provides down value to Photo, MMAPI and Video features. | Provides down value to Photo, MMAPI and Video features.
Mobile-rss | UiUtil.commandAction | m_urlRrnItem = null; | No dependencies found! | No dependencies found!
Mobile-rss | RssFormatParser.parseRssDate | logger.finest("date=" + date); | No dependencies found! | Requires date variable from root feature.

A feature model is a mechanism for modeling the common and variable parts of an SPL. Safe composition is used to ensure that all products of a product line are actually correct [29]. Thaker et al. [29] determined composition restrictions for feature modules and used these restrictions to ensure safe composition. However, the developer does not learn about any error in his maintenance before committing, since safe composition only catches errors after the maintenance task. Our approach differs from safe composition in that we intend to use emergent feature interfaces to prevent errors while maintaining features. Moreover, some elements in our EFI deal with system behavior (value assignment), rather than only with static type information. Nonetheless, safe composition is complementary: the developer may ignore the feature dependency shown by our approach and then introduce a type error, which safe composition approaches would catch after the maintenance task.

VI. CONCLUSION

This paper presents emergent feature interfaces, which might be applied when maintaining features in product lines in order to achieve independent feature comprehensibility by looking at entire feature dependencies. We provide an overall feature interface that considers all parts in an integrated way. Features tend to crosscut a software product line's code, and thus providing, or computing, a complete interface for them is difficult, if not impossible. The idea is to provide a global view of a feature by abstracting away irrelevant details. We focus on capturing data dependencies, but our proposal can be extended to compute other kinds of interfaces, including dependencies related to exceptions, control flows, and approximations of preconditions and postconditions.
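As a rough illustration of the data-dependency contracts discussed above, the sketch below computes "provides" and "requires" entries for statements that are already tagged with feature expressions. It is only a toy stand-in under strong assumptions: the real approach relies on feature-sensitive data-flow analyses built with Soot inside Emergo [20], [21], whereas here the per-statement feature expressions, definitions, and uses are supplied by hand and matched syntactically; all names in the example are hypothetical.

import java.util.*;

// Toy illustration of the "provides/requires" contracts an emergent feature
// interface reports. The real approach uses feature-sensitive data-flow
// analysis; here each statement is already tagged with a feature expression,
// and we only match variable definitions against uses across feature expressions.
public class ToyEmergentFeatureInterface {

    record Stmt(String featureExpr, Set<String> defs, Set<String> uses) {}

    // Emits the interface for the feature expression being maintained.
    static List<String> interfaceFor(String maintained, List<Stmt> method) {
        List<String> contract = new ArrayList<>();
        for (Stmt s : method) {
            if (!s.featureExpr().equals(maintained)) continue;
            for (Stmt other : method) {
                if (other.featureExpr().equals(maintained)) continue;
                for (String v : s.defs())
                    if (other.uses().contains(v))
                        contract.add("provides " + v + " value to " + other.featureExpr());
                for (String v : s.uses())
                    if (other.defs().contains(v))
                        contract.add("requires " + v + " variable from " + other.featureExpr());
            }
        }
        return contract.isEmpty() ? List.of("No dependencies found!") : contract;
    }

    public static void main(String[] args) {
        // Hypothetical method: the PhotoViewController.handleCommand sketch above.
        List<Stmt> handleCommand = List.of(
            new Stmt("sms || capturePhoto", Set.of("imgByte"), Set.of()),
            new Stmt("copyPhoto", Set.of(), Set.of("imgByte")));
        System.out.println(interfaceFor("sms || capturePhoto", handleCommand));
        // prints: [provides imgByte value to copyPhoto]
    }
}

A real implementation would, of course, have to compute the definition-use relations itself (for example, via reaching definitions) and reason about feature expressions semantically rather than by string equality.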
The feature modularization problem can be seen in any system, since features can be annotated (explicit) in the code base or not (implicit). Thus, our solution sits above the particular technique used to implement features in a system, although, for the time being, our tool only works when features are implemented using conditional compilation. We also discussed our progress over EI by adding required interfaces, macro (per-feature) information, and a simplified view of the existing dependencies in the code, implemented in a tool called Emergo. After a selection, Emergo shows an EFI to the developer, keeping him informed about the contracts between the selected feature and the other ones, with the potential of improving productivity. Although we did not conduct a controlled experiment involving developers in order to claim more precisely whether using EFI is better than EI in terms of maintenance effort, we can claim (through our case study) that EFI have potential benefits for developers, with a positive impact on productivity, since EFI present more global and accurate information. The results of our study, on five SPLs, suggest the feasibility and usefulness of the proposed approach, in which the worst case is equal to EI. As future work, we intend to improve our tool with more robust emergent feature interfaces. Also, we are working on adding interprocedural analysis to Emergo to capture feature dependencies among classes, packages, and components. Finally, we should conduct more studies, including a controlled experiment with developers, to draw more general conclusions.

ACKNOWLEDGMENTS

The authors would like to thank CNPq, FACEPE, and the National Institute of Science and Technology for Software Engineering (INES) for partially supporting this work. We also thank the reviewers and the SPG (http://www.cin.ufpe.br/spg) members for feedback and rich discussions about this paper.

REFERENCES

[1] P. Clements and L. Northrop, Software Product Lines: Practices and Patterns. Addison-Wesley, 2002.
[2] K. Pohl, G. Böckle, and F. J. v. d. Linden, Software Product Line Engineering: Foundations, Principles and Techniques. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2005.
[3] S. Trujillo, D. Batory, and O. Diaz, "Feature refactoring a multirepresentation program into a product line," in Proceedings of the 5th International Conference on Generative Programming and Component Engineering, ser. GPCE '06. New York, NY, USA: ACM, 2006, pp. 191–200. [Online]. Available: http://doi.acm.org/10.1145/1173706.1173736
[4] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, "Feature-Oriented Domain Analysis (FODA) Feasibility Study," Carnegie-Mellon University Software Engineering Institute, Tech. Rep., November 1990.
[5] C. Kästner, S. Apel, and M. Kuhlemann, "Granularity in software product lines," in Proceedings of the 30th International Conference on Software Engineering, ser. ICSE '08. New York, NY, USA: ACM, 2008, pp. 311–320. [Online]. Available: http://doi.acm.org/10.1145/1368088.1368131
[6] R. Kolb, D. Muthig, T. Patzke, and K. Yamauchi, "A case study in refactoring a legacy component for reuse in a product line," in Software Maintenance, 2005. ICSM'05. Proceedings of the 21st IEEE International Conference on, Sept. 2005, pp. 369–378.
[7] E. Alves, P. Matos, L. Cole, P. Borba, and G. Ramalho, "Extracting and Evolving Mobile Games Product Lines," in Proceedings of SPLC'05, LNCS 3714. Springer-Verlag, 2005, pp. 70–81.
[8] H. Spencer, "ifdef Considered Harmful, or Portability Experience with C News," in Proc. Summer '92 USENIX Conference, 1992, pp. 185–197.
[9] M. Krone and G. Snelting, "On the inference of configuration structures from source code," in Proceedings of the 16th International Conference on Software Engineering, ser. ICSE '94. Los Alamitos, CA, USA: IEEE Computer Society Press, 1994, pp. 49–57. [Online]. Available: http://dl.acm.org/citation.cfm?id=257734.257742
[10] J. M. Favre, "Understanding-in-the-large," in Proceedings of the 5th International Workshop on Program Comprehension, ser. WPC '97. Washington, DC, USA: IEEE Computer Society, 1997, pp. 29–. [Online]. Available: http://dl.acm.org/citation.cfm?id=523511.837867
[11] D. Lohmann, F. Scheler, R. Tartler, O. Spinczyk, and W. Schröder-Preikschat, "A quantitative analysis of aspects in the eCos kernel," in Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, ser. EuroSys '06. New York, NY, USA: ACM, 2006, pp. 191–204. [Online]. Available: http://doi.acm.org/10.1145/1217935.1217954
[12] C. Kästner and S. Apel, "Virtual Separation of Concerns - A Second Chance for Preprocessors," Journal of Object Technology, vol. 8, no. 6, pp. 59–78, 2009.
[13] D. L. Parnas, "On the criteria to be used in decomposing systems into modules," Commun. ACM, vol. 15, pp. 1053–1058, December 1972. [Online]. Available: http://doi.acm.org/10.1145/361598.361623
[14] M. Calder, M. Kolberg, E. H. Magill, and S. Reiff-Marganiec, "Feature interaction: a critical review and considered forecast," Comput. Netw., vol. 41, no. 1, pp. 115–141, Jan. 2003. [Online]. Available: http://dx.doi.org/10.1016/S1389-1286(02)00352-3
[15] M. Ribeiro, F. Queiroz, P. Borba, T. Tolêdo, C. Brabrand, and S. Soares, "On the impact of feature dependencies when maintaining preprocessor-based software product lines," in Proceedings of the 10th ACM International Conference on Generative Programming and Component Engineering, ser. GPCE '11. New York, NY, USA: ACM, 2011, pp. 23–32. [Online]. Available: http://doi.acm.org/10.1145/2047862.2047868
[16] M. Ribeiro, H. Pacheco, L. Teixeira, and P. Borba, "Emergent feature modularization," in Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion, ser. SPLASH '10. New York, NY, USA: ACM, 2010, pp. 11–18. [Online]. Available: http://doi.acm.org/10.1145/1869542.1869545
[17] C. Kästner, S. Apel, and K. Ostermann, "The road to feature modularity?" in Proceedings of the 15th International Software Product Line Conference, Volume 2, ser. SPLC '11. New York, NY, USA: ACM, 2011, pp. 5:1–5:8. [Online]. Available: http://doi.acm.org/10.1145/2019136.2019142
[18] J. Liebig, S. Apel, C. Lengauer, C. Kästner, and M. Schulze, "An analysis of the variability in forty preprocessor-based software product lines," in Proceedings of the 32nd International Conference on Software Engineering (ICSE), New York, NY, 2010, pp. 105–114.
[19] B. Meyer, "Applying 'design by contract'," Computer, vol. 25, no. 10, pp. 40–51, Oct. 1992. [Online]. Available: http://dx.doi.org/10.1109/2.161279
[20] C. Brabrand, M. Ribeiro, T. Tolêdo, and P. Borba, "Intraprocedural dataflow analysis for software product lines," in Proceedings of the 11th Annual International Conference on Aspect-oriented Software Development, ser. AOSD '12. New York, NY, USA: ACM, 2012, pp. 13–24. [Online]. Available: http://doi.acm.org/10.1145/2162049.2162052
[21] R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V.
Sundaresan, “Soot - a java bytecode optimization framework,” in Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research, ser. CASCON ’99. IBM Press, 1999, pp. 13–. [Online]. Available: http://dl.acm.org/citation.cfm?id= 781995.782008 [22] P. Runeson and M. Höst, “Guidelines for conducting and reporting case study research in software engineering,” Empirical Softw. Engg., vol. 14, no. 2, pp. 131–164, Apr. 2009. [Online]. Available: http://dx.doi.org/10.1007/s10664-008-9102-8 [23] Z. Gu, E. T. Barr, D. J. Hamilton, and Z. Su, “Has the bug really been fixed?” in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ser. ICSE ’10. New York, NY, USA: ACM, 2010, pp. 55–64. [Online]. Available: http://doi.acm.org/10.1145/1806799.1806812 [24] Z. Yin, D. Yuan, Y. Zhou, S. Pasupathy, and L. Bairavasundaram, “How do fixes become bugs?” in Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, ser. ESEC/FSE ’11. New York, NY, USA: ACM, 2011, pp. 26–36. [Online]. Available: http://doi.acm.org/10.1145/2025113.2025121 [25] J. Śliwerski, T. Zimmermann, and A. Zeller, “When do changes induce fixes?” SIGSOFT Softw. Eng. Notes, vol. 30, pp. 1–5, May 2005. [Online]. Available: http://doi.acm.org/10.1145/1082983.1083147 [26] J. Anvik, L. Hiew, and G. C. Murphy, “Who should fix this bug?” in Proceedings of the 28th international conference on Software engineering, ser. ICSE ’06. New York, NY, USA: ACM, 2006, pp. 361– 370. [Online]. Available: http://doi.acm.org/10.1145/1134285.1134336 [27] J. Liebig, C. Kästner, and S. Apel, “Analyzing the discipline of preprocessor annotations in 30 million lines of C code,” in Proceedings of the tenth international conference on Aspect-oriented software development, ser. AOSD ’11. New York, NY, USA: ACM, 2011, pp. 191–202. [Online]. Available: http://doi.acm.org/10.1145/1960275. 1960299 [28] H. Melo, R. Coelho, and U. Kulesza, “On a feature-oriented characterization of exception flows in software product lines,” in Software Engineering (SBES), 2012 26th Brazilian Symposium on, 2012, pp. 121–130. [29] S. Thaker, D. Batory, D. Kitchin, and W. Cook, “Safe composition of product lines,” in Proceedings of the 6th international conference on Generative programming and component engineering, ser. GPCE ’07. New York, NY, USA: ACM, 2007, pp. 95–104. [Online]. Available: http://doi.acm.org/10.1145/1289971.1289989 Software Variability Management An Exploratory Study with Two Feature Modeling Tools Juliana Alves Pereira, Carlos Souza, Eduardo Figueiredo Ramon Abilio, Gustavo Vale, Heitor Augustus Xavier Costa Computer Science Department, Federal University of Minas Gerais (UFMG) Belo Horizonte, Brazil {juliana.pereira, carlosgsouza, figueiredo}@dcc.ufmg.br Computer Science Department, Federal University of Lavras (UFLA) Lavras, Brazil [email protected], [email protected], [email protected] [11]. This adoption is expected to bring significant improvements to the software development process [28]. Due to these expected benefits, large companies [41], such as Boeing, Hewlett Packard, Nokia, and Siemens, have moved forward towards adopting SPL practices. Abstract—Software Product Line (SPL) is becoming widely adopted in industry due to its capability of minimizing costs and improving quality of software systems through systematic reuse of software artifacts. 
An SPL is a set of software systems sharing a common, managed set of features that satisfies the specific needs of a particular market segment. A feature represents an increment in functionality relevant to some stakeholders. There are several tools to support variability management by modeling features in SPL. However, it is hard for a developer to choose the most appropriate feature modeling tool due to the several options available. This paper presents the results of an exploratory study aiming to support SPL engineers choosing the feature modeling tool that best fits their needs. This exploratory study compares and analyzes two feature modeling tools, namely FeatureIDE and SPLOT, based on data from 56 participants that used the analyzed tools. In this study, we performed a four-dimension qualitative analysis with respect to common functionalities provided by feature modeling tools: (i) Feature Model Editor, (ii) Automated Analysis of Feature Models, (iii) Product Configuration, and (iv) Tool Notation. The main issues we observed in SPLOT are related to its interface. FeatureIDE, on the other hand, revealed some constraints when creating feature models. Feature modeling tools are used to support the management of variability in an SPL. These tools support the representation and management of reusable artifacts instead of providing means for conventional development from scratch. There are many available options of feature modeling tools [17] [22] [23] [29] [26] [39]. Therefore, choosing one tool that best meets the SPL development goals is far from trivial. After surveying several tools for variability management of SPL, this paper presents a detailed qualitative analysis of two feature modeling tools, namely SPLOT [26] and FeatureIDE [23]. We choose to focus our analysis on these tools because they provide the key functionality of typical feature modeling tools, such as to create and edit a feature model, to automatically analyze the feature model, and to configure a product. In addition, we decide to exclude early prototypes [17] [39] and proprietary tools [29] from our study because they could hinder some sorts of analyses. In addition, early prototype tools (i) do not cover all relevant functionalities we aim to evaluate and (ii) are not applicable to industry-strength SPL. Keywords—Software Product Line, Feature Models, SPLOT, FeatureIDE. This exploratory qualitative study involved 56 young developers taking an advanced SE course. Each participant used only one tool, either SPLOT or FeatureIDE. We relied on a background questionnaire and a 1.5-hour training session to balance knowledge of the study participants. The experimental tasks included typical variability management functionalities. After that, participants answered a survey with open questions about the functionalities they used in each tool. I. INTRODUCTION The growing need for developing larger and more complex software systems demands better support for reusable software artifacts [21] [28]. In order to address these demands, Software Product Line (SPL) has been increasingly adopted in software industry [1] [41]. “An SPL is a set of software intensive systems sharing a common, managed set of features that satisfies the specific needs of a particular market segment or mission [28]”. It is developed from a common set of core assets and variable features [11]. “A feature represents an increment in functionality relevant to some stakeholders [22]”. 
The study results may contribute with relevant information to support software engineers to choose a tool for the development and management of variability in SPL that best fits their needs. We focus our analysis on four functionalities available in typical feature modeling tools: Feature Model Editor, Automated The high degree of similarity among software systems in a specific domain favors the adoption of SPL 36 Analysis of Feature Models, Product Configuration, and the Feature Model Notation used by each tool. Based on this analysis, we uncover several interesting findings about the analyzed tools. First, we observed that most participants like the user interface of both SPLOT and FeatureIDE. However, although shortcuts and automatic feature model organization work fine in FeatureIDE, they were considered an issue by many SPLOT users. With respect to automated analysis, both tools present statistical information, but SPLOT succeeds by presenting data in an easier way to understand. Participants seem to be happy with SPLOT because it provides steps to guide users during the product configuration. However, a configuration can be created in SPLOT, but it cannot be saved. FeatureIDE, on the other hand, allows us to create multiple product configurations, to save them, and to set the default one. Fig. 1. Mobile Media Feature Model. In addition to features and their relationships, a feature model can also include composition rules. A composition rule refers to additional cross-tree constraints to restrict feature combinations [13]. It is responsible for validating a combination of unrelated features. Typical cross-tree constraints are inclusion or exclusion statements in the form “if feature F1 is included, then feature F2 must also be included (or excluded)”. For example, the feature model in Fig. 1 shows an inclusion constraint between SMS Transfer and Copy Media. That is, in order to receive a photo via SMS, this photo has to be copied in an album. The remainder of this paper is organized as follows. Section 2 briefly reviews the feature modeling concepts. In Section 3, the study is set and some characteristics of SPLOT and FeatureIDE are described. Section 4 reports and analyzes the results of this exploratory qualitative study. In Section 5, some threats to the study validity are discussed. Section 6 concludes this paper by summarizing its main contributions and pointing out directions for future work. II. SOFTWARE PRODUCT LINE B. Survey of Feature Model Notations Several modeling notations can be found in the literature [7] [12] [15] [19] [22] [37]. A feature model is an artifact generated during the domain analysis and used to describe a set of features and their relationships into the domain. A method for domain analysis is called Feature-Oriented Domain Analysis (FODA) proposed by Kang [22] which uses a graphical representation (feature model) to show the identified features. Large software companies have adopted SPL to develop their products [41]. SPL makes companies more competitive by providing large scale reuse with mass customization [28]. This section introduces SPL, focusing on feature model notations and tools. A. Feature Modeling Feature models are popular for representing variability in an SPL [13]. A feature model is a way to represent the space of possible configurations of all products in an SPL [2] [13]. It allows the visualization of hierarchical features and their relationships [14] [27]. Fig. 1 shows a feature model of a system, called MobileMedia [17]. 
Nodes in this figure represent features and edges show relationships between them. A single root node, MobileMedia, represents the concept of the domain being modeled. Some FODA’s extensions can be found in the literature, such as, (i) Feature-Oriented Reuse Method (FORM) [22], (ii) Generative Programming Feature Tree (GPFT) [12], (iii) Van Gurp and Bosch Feature Diagram (VBFD) [37], (iv) Variability Feature Diagram (VFD) [7], (v) Product Line Use Case Modeling for System and Software engineering (PLUSS) [15], (vi) FeatuRSEB (combination of FODA and Reuse-Driven Software Engineering Business) [19], (vii) UML-Based Feature Models (UML-BFM) [14], and (viii) Integrating Feature Modeling into UML (IFM-UML) [38]. These extensions changed how the features and their relationships are depicted. For instance, features are represented with a rectangle around its name in FORM. An OR decomposition was added and the representation of XOR decomposition was modified in FeatuRSEB. UML-BFM and IFM-UML extensions use UML notation. Features in a feature model can be classified as mandatory, optional, and alternative. Optional features are represented with an empty circle, such as Receive Photo in Fig. 1. They may or may not be part of a product. On the other hand, mandatory features, such as Media Management, are represented by filled circles and are part of all SPL products containing their parent feature. Alternative features may be exclusive (XOR) or not exclusive (OR). The former indicates that only one sub-feature can be selected from the alternatives. For example, Screen 1, Screen 2 and Screen 3 in Fig. 1 are alternative features for Screen Size. OR features, such as Photo and Music, allow the selection of more than one option for a product. In addition to graphical representation, variability in SPL can be represented also by text-based models. These models use structured text to describe features and its relationships. Examples of the languages used to write text-based models are: (i) Textual Variability Language (TVL) [8]; (ii) Clafer [4]; (iii) GUIDSL [3]; 37 and iv) SXFM [26]. In Section V, we discuss the notations used in SPLOT and FeatureIDE. such as Enterprise Architect and Rational Rhapsody from IBM. C. Feature Modeling Tools Since feature models are undergoing a rapid process of maturation, feature modeling tools are constantly being developed and adopted in practice. In order to assist modeling and management of SPL, several tools are already available, such as SPLOT [26], FeatureIDE [23], XFeature [39], FMP [18], and Pure::Variants [29]. We performed a survey of tools for variability management before choosing SPLOT [26] and FeatureIDE [23] as representatives. This section presents a brief overview of five tools for variability management in SPL. III. STUDY SETTINGS This section presents the study configuration aiming to evaluate two alternative feature modeling tools, namely SPLOT and FeatureIDE. Section III.A compares these tools and Section III.B summarizes the background information of participants that took part in this study. Section III.C explains the training session and tasks assigned to each participant. A. Selection of the Analyzed Tools SPLOT and FeatureIDE tools were selected for this study. We focus our analysis on both tools because these tools are mature and used in large software projects. Other tools, such as FMP and XFeature, are only academic prototypes and do not provide all functionalities available in professional tools. 
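Returning briefly to the feature-model notation reviewed in Section II, the sketch below encodes a small fragment of the MobileMedia model of Fig. 1 (the mandatory Media Management feature, the XOR screen-size group, the OR media group, and the SMS Transfer/Copy Media cross-tree constraint) and checks whether a feature selection forms a valid product. The encoding is our own illustrative propositional reading under simplifying assumptions; it is not how SPLOT or FeatureIDE represent models internally, and the remaining features of Fig. 1 are omitted.

import java.util.Set;

// Toy validity check for a MobileMedia-like fragment of Fig. 1.
// Just the usual propositional reading of mandatory/XOR/OR plus one cross-tree rule.
public class MobileMediaFragment {

    static boolean isValidProduct(Set<String> sel) {
        boolean root = sel.contains("MobileMedia");
        // Mandatory child: Media Management must be present whenever the root is.
        boolean mandatory = !root || sel.contains("Media Management");
        // XOR group: exactly one screen size.
        long screens = Set.of("Screen 1", "Screen 2", "Screen 3").stream()
                          .filter(sel::contains).count();
        boolean xor = !root || screens == 1;
        // OR group: at least one media type.
        boolean or = !root || sel.contains("Photo") || sel.contains("Music");
        // Cross-tree constraint: SMS Transfer requires Copy Media.
        boolean crossTree = !sel.contains("SMS Transfer") || sel.contains("Copy Media");
        return root && mandatory && xor && or && crossTree;
    }

    public static void main(String[] args) {
        System.out.println(isValidProduct(
            Set.of("MobileMedia", "Media Management", "Screen 1", "Photo",
                   "SMS Transfer", "Copy Media")));   // true
        System.out.println(isValidProduct(
            Set.of("MobileMedia", "Media Management", "Screen 1", "Photo",
                   "SMS Transfer")));                 // false: SMS Transfer requires Copy Media
    }
}

The automated analyses offered by the tools answer this same kind of question for the whole configuration space at once, which is what the following sections evaluate.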
We also aimed to select mature, actively developed, and accessible tools in order to evaluate the state of the art in feature modeling. Therefore, we also excluded proprietary tools, such as Pure::Variants, because proprietary tools could hinder some sorts of analyses; for instance, we do not have access to all the features of the tool.

SPLOT [26] is a Web-based tool for creating feature models and configuring products. SPLOT does not provide means for generation or integration of code. At the tool website, we can find a repository of more than 200 feature models created by tool users over 3 years. It is a free and open source project: one can download the tool's code and also SPLAR (a Java library created by the authors to perform the analysis of feature models). It also provides a standalone version that can be installed on a private machine. There is also an interesting feature, called workflow configuration, which defines a flow for product configuration. By using this feature, many people interact with each other in a collaborative way to configure an SPL product. Based on the use of both tools, the key features mentioned by participants in the study are summarized in Table I.

FeatureIDE [23] is a tool which widely covers the SPL development process. Besides having feature model editors and product configuration, it is integrated with several programming and composition languages, with a focus on development for reuse. FeatureIDE was developed to support both aspect-oriented [24] and feature-oriented programming [3]. This tool is implemented as an Eclipse plugin and can be downloaded separately or in a package with all dependencies needed for implementation.

TABLE I. FUNCTIONALITIES OF SPLOT AND FEATUREIDE.
Functionality | SPLOT | FeatureIDE
Feature model editor | ✓ | ✓
Automated feature model analysis | ✓ | ✓
Interactive product configuration | ✓ | ✓
Feature model notation | tree | tree and diagram
Integration with code | - | ✓
Available online | ✓ | -
Repository of feature models | ✓ | -
Configuration workflow | ✓ | -

XFeature [39] is a modeling tool implemented as an Eclipse plugin whose main goal is to automate the modeling and configuration of reusable artifacts. Initially, XFeature was created to assist the development of space applications; however, it currently supports the general development of SPL. Focused on creating models and meta-models, XFeature is still at a proof-of-concept stage and is rarely used in the software industry.

B. Participants
The participants involved in this study are 56 young developers (between 20 and 32 years old) taking an advanced Software Engineering course spanning four consecutive semesters, from 2011-1 to 2012-2. All participants are graduates or close to graduating, since the course targets postgraduate MSc and PhD students. To avoid biasing the study results, each participant only took part in one study semester and only used one tool, either SPLOT or FeatureIDE. That is, only one tool was used in each semester, as indicated in Table II. FeatureIDE was used by 27 participants, 6 in the first and 21 in the second semester. Additionally, SPLOT was used by 29 participants, 15 in the first and 14 in the second semester. Each participant worked individually to accomplish the study tasks (Section 3.3). Participants in the 1st semester are named T1-S01 to T1-S06; in the 2nd semester, T2-S01 to T2-S21; in the 3rd semester, T3-S01 to T3-S15; and in the 4th semester, T4-S01 to T4-S14.

Similar to XFeature, FMP (Feature Modeling Plugin) [18] is a tool implemented as an Eclipse plugin which focuses on the variability modeling of SPL. Developed at the University of Waterloo, it was supposed to be integrated with Rational Software Modeler (RSM) and Rational Software Architect (RSA).
However, the project is currently discontinued.

Developed by Pure Systems, a company specializing in reuse-based software development, Pure::Variants [29] is a mature tool for the development of SPL. It can be used not only as an Eclipse plugin, but can also be integrated into some commercial tools.

TABLE II. BACKGROUND OF PARTICIPANTS. (Columns: T1 2011-1 and T2 2011-2 for FeatureIDE; T3 2012-1 and T4 2012-2 for SPLOT. For each group, the table lists the participants who claimed medium or high Work Experience, UML Design, and Java Programming knowledge, the number of participants (6, 21, 15, and 14, respectively), and, in the last column, the participants who did not answer the questionnaire.)

Before starting the experiment, we used a background questionnaire to acquire previous knowledge about the participants. Table II summarizes the knowledge that participants claimed to have in the background questionnaire with respect to work experience, UML design, and Java programming. The second, third, fourth, and fifth columns in this table show the participants who claimed to have medium or high knowledge of a particular skill. Answering the questionnaire was not compulsory, and participants who did not answer it are annotated in the last column (No Answer). However, although in some cases participants chose not to answer the questionnaire, we observe in Table II that, in general, all participants have at least basic knowledge of software development and technology.

A. General Analysis
Figures 2, 3, 4, and 5 show the most recurrent answers of participants grouped by category. Both FeatureIDE and SPLOT users like the interface of the tool they used and, in both cases, on average 63% of participants made some positive comment about it. As indicated by Fig. 2, FeatureIDE also received positive comments about its presentation and usability, such as feature editor usability, reorganization of features, automatic organization, and the keyboard shortcut functionality. What draws our attention most, however, is that most participants said that their contact with FeatureIDE was too short to properly evaluate it (Fig. 3). About 41% of participants said they could not write any negative comments about this tool. For instance, Participant T2-S19 states: "I believe that with a little more practice I would better understand the tool in order to criticize it". Similarly, Participant T1-S06 states: "The tool has many features, but some take time to learn to handle. It can be confusing to beginner programmers". Since the exposure time was the same for both FeatureIDE and SPLOT, we conclude that FeatureIDE is a more complex tool for users at first glance. Consequently, it requires a longer learning time.

C. Training Session and Tasks
We conducted a 1.5-hour training session where we introduced participants not only to the analyzed tools but also to the basic concepts of feature modeling and SPL.
The same training session (with the same content and instructor) was performed in all four groups (2011-1, 2011-2, 2012-1, and 2012-2). After the training session, we asked participants to perform some tasks using either SPLOT or FeatureIDE (see Table II). The tasks included using functionalities (i) to create and edit a feature model, (ii) to automatically analyze the created feature model and observe its statistics, and (iii) to configure a product using the tool's product configuration functionality. Finally, we asked participants to answer a questionnaire with two simple questions about the functionalities of the tool that they like and dislike. A questionnaire with open questions was used as an initial study to list the main functionalities provided by the analyzed tools (Table I). We focus this paper on the most interesting results, but the questionnaire and all answers are available on the project website [16]. The whole study was performed in a computer laboratory with 25 identically configured computers.

SPLOT was widely praised for its automated analysis and display of statistics about feature models (Fig. 4). In fact, a major goal of SPLOT is to provide statistical information quickly during model creation and product configuration. On the other hand, several participants indicated usability-specific issues in this tool, such as the inability to rearrange the features inside an existing model and difficulty in finding buttons (for instance) to save the model. These observations show that the tool is on the right path, but there are still many opportunities to improve its usability. Other issues, such as problems in setting up the model, confusing nomenclature, lack of integration with other tools, and lack of privacy, are also cited by some participants (Fig. 5).

IV. RESULTS AND ANALYSIS
This section reports and discusses the data of this qualitative study. Section 4.1 reports the general quantitative analysis based on the answers of the participants. Section 4.2 focuses the discussion on the Feature Model Editor of each tool. Section 4.3 discusses the automated analysis of feature models and Section 4.4 presents the results for the product configuration. Finally, Section 4.5 analyzes the impact of the different notations adopted by each tool.

Fig. 2. Positive comments in FeatureIDE ("What do you like most in FeatureIDE tool?"; categories: simple and intuitive interface, editor usability, automatic organization, keyboard shortcuts, code generation; % of T1 and T2 participants).
Fig. 3. Negative comments in FeatureIDE ("What do you like less in FeatureIDE tool?"; categories: interface/usability, editor usability, confusing nomenclature, product configuration, lack of integration with tools, generating little code, no support for different languages, brief contact with the tool, nothing (FeatureIDE is perfect); % of T1 and T2 participants).
Fig. 4. Positive comments in SPLOT ("What do you like most in SPLOT tool?"; categories: availability online, simple and intuitive interface, editor usability, automated analysis, product configuration, sharing repository; % of T3 and T4 participants).
Fig. 5. Negative comments in SPLOT ("What do you like less in SPLOT tool?"; categories: interface/usability, editor usability, confusing nomenclature, product configuration, lack of integration with tools, lack of integration with code, unsafe sharing repository, nothing (SPLOT is perfect); % of T3 and T4 participants).

B. Feature Model Editor
Both FeatureIDE and SPLOT have a feature model editor (Figures 6 and 7). Interestingly, participants cited the feature model editor as a functionality they like in both tools (Figures 2 and 4). Focusing on FeatureIDE, its
Constraints are listed immediately below the feature model in this tool. Even less experienced participants think the editor interface of FeatureIDE is simple and easy to use. For instance, T2-S17 said that “Even without seeing the tutorial, I used it and performed all tasks right from the first time”. Similar comments were made by other participants, such as T2S21 who stated that “The graphical representation of the feature model is organized and facilitates visualizing the configuration space”. observed that “shortcuts make it easy to switch between mandatory, optional, and alternative features”. Following the same trend, Participant T2-S6 concludes that “shortcuts helped to speed up the feature model creation”. Fig. 6. Feature Model Editor in FeatureIDE. Fig. 7. Feature Model Editor in SPLOT. About 57% of participants made positive comments about editor usability. Two positive functionalities were cited: namely automatic organization of features (30% commented on it) and shortcuts for the mostly used functions (12% commented on it). For instance, with respect to the automatic organization of features, Participant T1-S01 stated that “the tool allows a nice view of the feature model because when we insert new features it automatically adjusts spaces between boxes to keep everything on screen”. Participant T1-S01 also Although participants claimed that the interfaces of both tools are simple and easy to use, some of them pointed out issues with specific functionalities in the feature model editors. For instance, Participant T1-S02 complained about FeatureIDE that “the automatic organization of features in the feature model editor does not allow specific adjustments for better visualization”. In fact, Participant T1-S2 contradicts Participant T1-S01 (above) about automatic organization of features. However, we observed that, in fact, FeatureIDE users 40 like this functionality, but they also expect more control to disable it when necessary. Another typical complaint was that “an accidental double click in one feature causes it to change, for instance, from optional to mandatory feature” as observed by Participant T2-S7. Again, to solve this problem, users of FeatureIDE expect a way to enable and disable this shortcut for changing the feature type. Statistical information can be useful for software engineers who are modeling an SPL or configuring a product. For instance, the number of possible configurations is a valid indicator of the customization power of an SPL. Participants mention a great benefit that the tool brings by allowing automatic analysis of valid configurations. For instance, according to Participant T3-S05 It “allows you to view all products and validate a feature model”. This participant also mentions that “we can hardly see this statistics in the middle of several lines of code”. With respect to SPLOT, almost 60% of participants like the automatic feature model analysis functionality (Section 4.3). However, participants recurrently complained about the lack of support to restructure a feature model. For instance, Participant T3-S12 said that “there is no way to manually reorder features of a tree by dragging and dropping them”. A similar observation is made by Participant T3-S02: “one cannot create an OR or XOR group based on pre-existing features”. The Web interface of SPLOT also led some usability-related issues. 
One of these issues was raised by Participant T3-S3 saying that “The ENTER button does not work to save changes in a feature; you need to click outside a feature box to validate your changes”. Besides the Enter button, another participant also spots an issue with the Delete button. Participant T3-S3 stated that “a feature is not removed by pressing the delete button”. In other words, participants of this study observed that, in general, shortcuts work fine in FeatureIDE, but it is a weak aspect in SPLOT. C. Automated Feature Model Analysis It is notable that features models are continually increasing in size and complexity. Therefore, it is required automated support for product configuration and verification of model consistency [6] [15] [27]. Both FeatureIDE and SPLOT offer different types of statistics as presented in Figures 8 and 9. Fig. 9. Statistical analysis in FeatureIDE. Fig. 8 shows a SPLOT screenshot with the feature model statistics. Among other information, SPLOT shows the number of dead features, if the model is consistent, and the number of valid configurations of the analyzed feature model. SPLOT also requires the user to save the feature model on its repository and use an URL to analyze the model. Participant T3-S13 points out this fact as a drawback by saying “if user does not generate the URL for the feature model, s/he cannot use the automated analysis and product configuration options”. SPLOT and FeatureIDE rely on BDD [9] and on a SAT [20] solver for statistical analysis. Fig. 8. Statistical analysis in SPLOT. 41 Fig. 9 presents statistical information in FeatureIDE. FeatureIDE displays, for instance, (i) the number of features added and removed after changes in the product configuration, and (ii) the number of concrete, abstract, primitive, compound and hidden features. However, although both tools are mature enough, many of the participants who have used FeatureIDE reported great difficulty in finding some functions in a tool. One of the participants says “The interface FeatureIDE is not intuitive. The tool has many features, but some of them take time to learn how to use, which can be confusing for novice programmers”. ones and abstract features are used to structure a feature model, and they do not have any impact at implementation level. The abstract features can be used (i) to represent complex functions that cannot be implemented without first be factored out or (ii) to represent features not implementable. FeatureIDE also has the concept of hidden features. The hidden features are concrete or abstract ones that for some reason were hidden, but are present in the diagram. Hidden features should not be related to cross-tree constraints, because they will not be considered in the model validation. D. Product Configuration After creating a feature model, we asked participants to configure a product. Product configuration was performed in both tools by means of the “auto complete” functionality as shown in Figures 10 and 11, respectively. This functionality only allows instantiation of valid products. SPLOT allows a single product configuration to be created and it does not allow saving it in the repository (Fig. 10). On the other hand, FeatureIDE allows to create multiple configurations and to set the default one (Fig. 11). Although SPLOT has this disadvantage, it provides a table that contains the steps users should take to configure a product. Through this table, it is possible to have enhanced control over the feature choices. 
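The statistics mentioned above (number of valid configurations, model consistency, dead features) come from the BDD- and SAT-based reasoning in the tools themselves [9] [20]; the sketch below only illustrates what those numbers mean, using brute-force enumeration over a toy model fragment like the one sketched earlier. It is an assumption-laden illustration, feasible only for very small models, and is not how SPLOT or FeatureIDE implement their analyses.

import java.util.*;

// Brute-force illustration of "number of valid configurations" and "dead
// features" over a tiny model; real tools use SAT/BDD reasoning instead.
public class ToyFeatureModelStats {

    static final List<String> FEATURES = List.of(
        "MobileMedia", "Media Management", "Screen 1", "Screen 2", "Screen 3",
        "Photo", "Music", "SMS Transfer", "Copy Media", "Receive Photo");

    // Same toy semantics as the MobileMediaFragment sketch above.
    static boolean isValid(Set<String> sel) {
        boolean root = sel.contains("MobileMedia");
        boolean mandatory = !root || sel.contains("Media Management");
        long screens = List.of("Screen 1", "Screen 2", "Screen 3").stream()
                           .filter(sel::contains).count();
        boolean or = !root || sel.contains("Photo") || sel.contains("Music");
        boolean crossTree = !sel.contains("SMS Transfer") || sel.contains("Copy Media");
        return root && mandatory && screens == 1 && or && crossTree;
    }

    public static void main(String[] args) {
        int valid = 0;
        Map<String, Integer> occurrences = new HashMap<>();
        int n = FEATURES.size();
        for (int mask = 0; mask < (1 << n); mask++) {      // enumerate all 2^n selections
            Set<String> sel = new HashSet<>();
            for (int i = 0; i < n; i++)
                if ((mask & (1 << i)) != 0) sel.add(FEATURES.get(i));
            if (isValid(sel)) {
                valid++;
                sel.forEach(f -> occurrences.merge(f, 1, Integer::sum));
            }
        }
        System.out.println("valid configurations: " + valid);
        // A dead feature never occurs in any valid configuration.
        for (String f : FEATURES)
            if (!occurrences.containsKey(f)) System.out.println("dead feature: " + f);
    }
}

Adding a constraint that contradicts another (for example, forcing a feature that the model already excludes) would make the corresponding feature dead, which is exactly the situation Fig. 12 later illustrates in FeatureIDE.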
Participant T3-S04 stated that in SPLOT “the screen to setup a product is well developed with information to configure products generated on the fly”. Fig.11. Product Configuration in FeatureIDE. Dead features can be generated by a cross-tree constraint validation in FeatureIDE and SPLOT. Fig. 12 shows a dead feature in FeatureIDE. A dead feature is any mandatory, optional or alternative feature that is never part of any valid product in the SPL because a constraint in the feature model "removes" the feature from any product configuration (typically by means of constraint propagation). For example, since Video is always included in any product that constraint would force Music to be excluded from all products and therefore feature Music is dead. In a consistent feature model since the cross-tree constraint was removed, the root feature is always mandatory in both SPLOT and E. Feature Model Notation Two different types of features can be modeled in FeatureIDE [35]: (i) concrete (non-abstract); and (ii) abstract, but only concrete features are represented in SPLOT. The difference between both features is the first one to be mapped to at least one implementation artifact. In other words, both features can be mandatory or optional, but concrete features represent implementable Fig. 10. Product Configuration in SPLOT. 42 a feature have two mandatory sub-features. One may argue that cardinality should not be used when arcs have the same semantics. On the other hand, when greater precision is desired, cardinality can be useful. In fact, the cardinality increases expressiveness of the used notation [13]. FeatureIDE. However, the root is an abstract feature in FeatureIDE, because it is seen as an abstraction, something complex that treats the domain being modeled. In general, SPLOT and FeatureIDE can be fairly used to represent feature models since they have the main model elements required to model the domain. However, as observed by Participant T3-S4, SPLOT has a minor advantage “for allowing interactive configuration of a product using a unified and intuitive notation”. SPLOT has certain insecurity “due to the public availability of the model” said T3-S4 because “anyone can change the repository content (and harm feature models of others)” pinpointed T3-S2. It also has some limitations to generate code and communicate with other tools (no import/export function). FeatureIDE by presenting some additional notations (abstract and hidden feature for example) may seem a little confused for novice users. Participants T4-S4, S5, S8, S9, S10 and S14 express large difficulties with the notation. Participant T4-S8 reports that "the terms used by this tool are sometimes very confused". Fig. 12. Dead Feature in FeatureIDE. The feature model can be visualized as a directorylike tree of features in both tools. In addition to the standard visualization, FeatureIDE also allows the visualization of feature models as a graph diagram. FeatureIDE can import/export models in GUIDSL or SXFM [10] format while SPLOT relies only on SXFM as the standard file format. Both tools also allow feature modeling with the graphical editor. The model elements available and how they are depicted in SPLOT and FeatureIDE are different since they use different representations of feature diagrams. SPLOT shows the diagram in tree format and FeatureIDE has two representation ways: (i) diagram; and (ii) tree format. 
FeatureIDE and SPLOT also perform diagram validation/consistency analysis and show the number of (possible) valid configurations. V. RELATED WORK The study of Turnes et al. [36] conducted a comparative analysis of variability management techniques for SPL tool development in the context of the SPL Hephaestus tool. Hephaestus is developed in Haskell and originally aimed at managing variability in requirements, but which has evolved to handle variability in different kinds of artifacts. Unlike our study, this study was particularly suitable to use in functional languages. The graphical representation of the main elements of a feature model available in SPLOT and FeatureIDE is presented in Table III. Note that, OR and XOR decompositions are explicitly represented in SPLOT and FeatureIDE, but using different strategies. The cardinality is used to represent the OR and XOR decompositions in SPLOT, while these decompositions are represented by an arc in FeatureIDE. A similar relation to the AND decomposition is represented, when The work by Saratxag et al. [31] explores the lack of tool support to control the evolution and design of product families, based on an exhaustive variant analysis. In this paper, a product line tool chain is presented based on the analysis of current SPL tools and approaches. The main goal is to show the benefits of a combination of SPL tools in an industrial scenario. In TABLE III. MODELING ELEMENTS OF FEATURE DIAGRAMS. Element\Tool SPLOT Mandatory Feature Optional Feature Abstract Feature - Dead Feature Hidden Feature - Root (concept/ domain) Cross-Tree Constraint (requires) Cross-Tree Constraint (excludes) OR Decomposition XOR Decomposition 43 FeatureIDE tools evidence that should be further investigated in later controlled experiments. A related study was conducted by Simmonds et al. [33]. This study investigates the appropriateness of different approaches for modeling process variability. The aims to evaluate the supporting tools that can be incorporated within a tool chain for automating software process tailoring. For this, summarizes some of the main features of eight tools that support process variability modeling. A more recent study [32] demonstrates an industrial case study with the SPLOT tool. It shows how it is used to specify and analyze software process models that include variability. Although both studies assess the degree of usability SPLOT, they are not based on concrete experimental data. Construct validity reflects to what extent the operational measures that are studied really represent what the researcher have in mind and what is investigated according to the research questions [30]. Conclusion validity, on the other hand, concerns the relation between the treatments and the outcome of the experiment [34] [40]. These threats may have occurred in the formulation of the questionnaire or during the interpretation of the results by the researchers since our study is mainly qualitative. Due to this qualitative nature, data are not suitable for quantitative analysis and statistical tests. Questionnaire with open questions was used as a qualitative initial study to list the main functionality provided by the tools, and futurelly conduct quantitative studies and statistical tests. As far as we are concerned, this is the first experiment conducted to analyze and compare these tools. To minimize this treat, we cross-discuss all the experimental procedures. 
Basili [5] and Kitchenham [25] argue that qualitative studies play an important role in experimentation in software engineering. contrast, our individually. studies evaluate only SPL VI. THREATS TO VALIDITY The purpose of this study exploratory is to support SPL engineers choosing the feature modeling tool that best fits their needs. A key issue when performing this kind of experiment is the validity of the results. Questions one may seek to answer include: was the study designed and performed in a sound and controlled manner? To which domain can the results generalize? In this section, threats are analyzed. We discuss the study validity with respect to the four groups of common validity threats [40]: internal validity, external validity, constructs validity, and conclusion validity. VII. CONCLUSION SPL focuses on systematic reuse based on the composition of artifacts and domain modeling. SPLOT and FeatureIDE are tools to support SPL variability management. In this paper, these tools were qualitatively analyzed and some interesting results were presented and discussed. This analysis was based on an exploratory study, in which we investigate the strengths and weaknesses of each tool grounded by surveying 56 young developers. The results reported in this paper should support software engineers to choose one of these tools for feature modeling. Additionally, this study can also be used by developers and maintainers of SPLOT and FeatureIDE - and other feature modeling tools - to improve them based on the issues reported. External validity concerns the ability to generalize the results to other environments, such as to industry practices [40]. A major external validity can be the selected tools and participants. We choose two tools, among many available ones, and we cannot guarantee that our observations can be generalized to other tools. Similarly, we select data of all participants in each of the four groups evaluated were used and, therefore, we do not take into consideration the knowledge of the participants. For instance, previous experiences with SPL or with some of the tools used not were taken into consideration. Thus, we cannot generalize the results because these experiences could positively influence the results obtained. We are currently running additional rounds of the experiment, with greater control over the study, in order to increase our data set, as well as the external validity of the results. After carrying out the exploratory study, we analyze the data and discuss our main results. In fact, SPLOT and FeatureIDE are two very different tools, although both tools provide support for feature modeling. SPLOT is a tool to be readily accessible to those interested only on SPL modeling and analysis (before development) and FeatureIDE is a tool focused on integration with the development process. Models created by SPLOT are automatically stored into the tool repository available to all tool users. On the other hand, FeatureIDE supports different languages for SPL implementation, such as, AspectJ [24] and AHEAD [3]. However, when choosing one of the tools the need and purpose of use is one of the main factors to be taken into consideration. Internal validity of the experiment concerns the question whether the effect is caused by the independent variables (e.g. course period and level of knowledge) or by other factors [40]. In this sense, a limitation of this study concerns the absence of balancing the participants in groups according to their knowledge. 
It can be argued that the level of knowledge of some participant may not reflect the state of practice. To minimize this threat, we provide a 1.5 hour training session to introduce participants to the basic required knowledge and a questionnaire for help the better characterize the sample as a whole. Additionally, 1.5 hour training session may not have been enough for subjects that begin without much knowledge and being exposed for the first time to these tools. However, Basili [5] and Kitchenham [25] argue that even less experienced participants can help researchers to obtain preliminary, but still important In future work, results of this study can be used and extended in controlled experiment replications. Quantitative data analysis is planned to be further performed in controlled experiments. Participants can answer additional and specific questions about SPL and feature modeling or perform tasks that exercise interesting aspects of SPL development, such integration and testing of feature code. In addition, other feature modeling tools can be analyzed and compared to SPLOT and FeatureIDE. 44 [19] Griss, M. L.; Favaro, J.; d’Alessandro, M. “Integrating Feature Modeling with the RSEB”. In proceedings of the 5th International Conference on Software Reuse, 1998, pp. 76-85. [20] Janota M. “Do SAT Solvers Make Good Configurators?” In proceedings of the Workshop on Analyses of Software Product Lines (ASPL 2008) at SPLC, 2008, pp. 191-195. [21] Kang, K. Sajoong K.; Jaejoon L.; Kijoo K.; Euiseob S.; Moonhang H. “FORM: A Feature-Oriented Reuse Method with Domain-Specific Reference Architectures”. Annals of Software Engineering, 1998, pp. 143-168. [22] Kang, K. C.; Cohen, S. G.; Hess, J. A.; Novak, W. E.; Peterson; A. S. “Feature Oriented Domain Analysis (FODA) Feasibility Study”. Technical Report CMU/SEI-90-TR-021. Software Engineering Institute. 155p. Accessed: August, 1990, 2012. Available: http://www.sei.cmu.edu/reports/90tr021.pdf. [23] Kastner, C. Sch. of Comput. Sci., Thum, T. ; Saake, G. ; Feigenspan, J. ; Leich, T. ; Wielgorz, F. ; Apel, S. “FeatureIDE: A Tool Framework for Feature-Oriented Software Development”. In proceedings of the Int'l Conf. on Software Engineering (ICSE), 2009, pp. 611-614. [24] Kiczales G.; Hilsdale E.; Hugunin J.; Kersten M.; Palm J.; Griswold W. G.; “An Overview of AspectJ”. In proceedings of the European Conference on Object-Oriented Programming (ECOOP), 2001, pp. 327-354. [25] Kitchenham, B.A. et al. “Preliminary Guidelines for Empirical Research in Software Engineering”. Transactions on Software Engineering, 28, 8, 2002, pp. 721-734. [26] Mendonça, M., Branco, M., Cowan, D. “SPLOT - Software Product Lines Online Tools”. In proceedings of the International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), 2009, pp. 761-762. [27] Mendonça, M., Wasowski, A., and Czarnecki, K. “SAT-Based Analysis of Feature Models is Easy”. In proceedings of the Int'l Software Product Line Conference (SPLC), 2009, pp. 231-240. [28] Pohl, K., Bockle, G., and Linden, F. “Software Product Line Engineering: Foundations, Principles and Techniques”. Springer, 2005, 494p. [29] Pure.Systems: http://www.pure-systems.com/Home.142.0.html. 20/06/2012. [30] Runeson P. and Host M. “Guidelines for Conducting and Reporting Case Study Research in Software Engineering”. Empirical Software Eng., 2009, pp. 131–164. [31] Saratxaga, C. L., et al. “Product line tool-chain: variability in critical systems”. Product Line Approaches in Soft. 
Engineering (PLEASE), 3rd International Workshop on. IEEE, 2012. [32] Simmonds J. et al. “Modeling Variability in Software Process Models”. Computer Science Department, Universidad de Chile, Santiago, Chile. 2012. [33] Simmonds, Jocelyn, et al. “Analyzing Methodologies and Tools for Specifying Variability in Software Processes”. Computer Science Depart., Universidad de Chile, Santiago, Chile. 2011. [34] Thelin T., Runeson P., Wohlin C., Olsson T., Andersson C. “Evaluation of Usage-Based Reading - Conclusions After Three Experiments”. Empirical Soft. Engineering, 2004, pp. 77-110. [35] Thum, T., Kastner, C., Erdweg, S., and Siegmund, N. “Abstract Features in Feature Modeling”. In proceedings of the Int'l Software Product Line Conference (SPLC), 2011, pp. 191-200. [36] Turnes, Lucinéia, et al. “Techniques for Developing a Product Line of Product Line Tools: a Comparative Study”. Software Components, Architectures and Reuse (SBCARS), 2011 Fifth Brazilian Symposium on. IEEE, 2011. [37] Van Gurp, J.; Bosch, J.; Svahnberg, M. “On the Notion of Variability in Software Product Lines”. In proceedings of the Working IEEE/IFIP Conf. on Soft. Architecture. 2001, pp. 45. [38] Vranic, V.; Snirc, J. Integrating Feature Modeling into UML. In: NODe/GSEM, 2006, pp. 3-15. [39] XFeature: http://www.pnp-software.com/ XFeature/. 20/06/2012. [40] Wohlin C.; Runeson P.; Hst M.; Ohlsson M. C.; Regnell B.; Wessln As. “Experimentation in Software Engineering”, 2012, Springer. [41] Product Line Hall of Fame: http://splc.net/fame.html. ACKNOWLEDGMENT We would like to acknowledge CNPq: grants 312140/2012-6 and 485235/2011-0; and FAPEMIG: grants APQ-02376-11 and APQ-02532-12. Juliana is sponsored by FAPEMIG and Ramon is sponsored by CAPES. REFERENCES [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] Alferez, M.; Santos, J.; Moreira, A.; Garcia, A.; Kulesza, U.; Araujo, J.; Amaral, V. “Multi-view Composition Language for Software Product Line Requirements”. In proceedings of the 2nd International Conference on Software Language Engineering (SLE), 2009, pp. 103-122. Bagheri, E., Noia, T. D., Ragone, A., Gasevic, D. “Configuring Software Product Line Feature Models Based on Stakeholders’ Soft and Hard Requirements”. In proceedings of the Internat. Software Product Line Conference (SPLC), 2010, pp. 16-30. Batory, D. S. “Feature Models, Grammars, and Propositional Formulas”. In proceedings of the 9th International Conference on Software Product Lines, 2005, pp. 7-20. Bak, K.; Czarnecki, K.; Wasowski, A. “Feature and MetaModels in Clafer: Mixed, Specialized, and Coupled”. In proceedings of the 3rd International Conference on Software Language Engineering, 2010, pp. 102-122. Basili V., Shull F., and Lanubile F. “Building Knowledge through Families of Experiments”. Transactions on Software Engineering. 25(4), 1999, pp. 456–473. Benavides, D., Trinidad, P., Ruiz-Cortes, A. “Automated Reasoning on Feature Models”. In proceedings of the International Conference on Advanced Information Systems Engineering (CAiSE), 2005, pp. 491-503. Bontemps, Y; Heymans, P.; Schobbens, P.; Trigaux, J. “Semantics of FODA Feature Diagrams”. In proceedings of the Workshop on Software Variability Management for Product Derivation (Towards Tool Support), 2004, pp. 48-58. Boucher, Q.; Classen, A.; Faber, P.; Heymans, P. “Introducing TVL, a Text-Based Feature Modelling Language”. In proceedings of the 4th International Workshop on Variability Modelling of Software-intensive Systems, 2010, pp. 159-162. Bryant R. 
“Graph-based Algorithms for Boolean Function Manipulation”. IEEE Transact. on Comp., 1986, pp. 677–691. Chae W. and Hinrichs T. “SMARTFORM: A Web-based Feature Configuration Tool”. In proceedings of the Fourth International Workshop on Variability Modelling of Softwareintensive Systems (VAMOS), Linz, Austria, 2010. Clements, P., Northrop, L. “Software Product Lines: Practices and Patterns”. Addison-Wesley Professional, 2002. Czarnecki, K.; Eisenecker, U. “Generat. Program. Methods, Tools, and Applications”. Addison-Wesley, 2000, 864p. Czarnecki, K., Helsen, S., and Eisenecker, U. “Formalizing Cardinality-based Feature Models and Their Specialization”. Software Process: Improvement and Practice, vol. 10, issue 1, 2005, pp. 7-29. Dolog, P., Nejdl, W. “Using UML-Based Feature Models and UML Collaboration Diagrams to Information Modelling for Web-Based Applications”. In proceedings of the International Conference of UML, 2004, pp. 425-439. Eriksson, M, Borstler. J., Borg, K. “The PLUSS Approach Domain Modeling with Features, Use Cases and Use Case Realizations”. In proceedings of the International Software Product Line Conference (SPLC), 2005, pp. 33-44. Exploratory Study Data: http://www.dcc.ufmg.br/~juliana.pereira/spl_study. 15/01/2013. Figueiredo, E. et al. “Evolving Software Product Lines with Aspects: An Empirical Study”. In proceedings of the Int'l Conf. on Software Engineering (ICSE).2008, pp. 261-270. Franklin Machine Products: http://www.fmponline.com/. 20/06/2012. 45 Evaluating the Propagation of Exceptions in the Service Oriented Architecture in .NET José Alex, Eliezio Soares, José Sueney, Nélio Cacho, Roberta Coelho, Umberto Costa Departamento de Informática e Matemática Aplicada Universidade Federal do Rio Grande do Norte (UFRN) Caixa Postal 59078-970– Natal – RN – Brasil {alexmed,elieziosoares,jsueney}@ppgsc.ufrn.br, {neliocacho,roberta,umberto}@dimap.ufrn.br de reagir apropriadamente diante da ocorrência de exceções, continuando ou interrompendo sua execução, a fim de preservar a integridade do estado do sistema [27]. Abstract—High dependability, availability and fault-tolerance are open problems in Service-Oriented Architecture (SOA). The possibility of generating software applications by integrating services from heterogeneous domains, in a seamless way, makes worthwhile to face the challenges inherent to this paradigm. In order to ensure quality in service compositions, some research efforts propose the adoption of verification techniques to identify and correct errors. In this context, exception handling is a powerful mechanism to increase SOA quality. Several research works are concerned with mechanisms for exception propagation on web services, implemented in many languages and frameworks. However, to the extent of our knowledge no work evaluates these mechanisms in SOA with regard to the .NET framework. The main contribution of this paper is the evaluation of exception propagation mechanisms in SOA applications developed within the .NET framework. O tratamento de exceções é um dos mecanismos mais utilizados para implementar sistemas robustos [3]. Ele está embutido na maioria das linguagens de programação, tais como Java, C# e C++. Essas linguagens oferecem abstrações para encapsular as condições excepcionais e construções próprias para lidar com elas durante a execução do programa, tanto na detecção quanto no tratamento destas condições. O desenvolvimento de aplicações robustas baseadas em serviços web pode se beneficiar dos mecanismos de tratamento de exceções. 
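Para ilustrar o mecanismo citado acima, o esboço a seguir (ilustrativo, não faz parte do estudo original) mostra como uma aplicação cliente em C# pode capturar exceções propagadas por uma chamada a um serviço web. O contrato IWSCalc é uma suposição que espelha o serviço WSCalc descrito mais adiante neste artigo, e o endereço do endpoint é o reportado na Tabela I.

using System;
using System.ServiceModel;

// Contrato assumido, espelhando o serviço WSCalc usado nos experimentos.
[ServiceContract]
public interface IWSCalc
{
    [OperationContract]
    int getMul(int a, int b);
}

public class ClienteExemplo
{
    public static void Main()
    {
        var factory = new ChannelFactory<IWSCalc>(
            new BasicHttpBinding(),
            new EndpointAddress("http://10.84.110.32:8081/wscalc.asmx")); // endereço da Tabela I
        IWSCalc proxy = factory.CreateChannel();
        try
        {
            Console.WriteLine(proxy.getMul(3, 5));
        }
        catch (FaultException e)            // falha SOAP sinalizada pelo próprio serviço
        {
            Console.WriteLine("Falha do serviço: " + e.Message);
        }
        catch (EndpointNotFoundException e) // endereço, porta ou nome de serviço incorretos
        {
            Console.WriteLine("Endpoint não encontrado: " + e.Message);
        }
        catch (CommunicationException e)    // demais falhas de transporte/comunicação
        {
            Console.WriteLine("Erro de comunicação: " + e.Message);
        }
    }
}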
Neste contexto, exceções podem ser lançadas por quaisquer dos serviços que compõem uma aplicação. De acordo com Gorbenko et al [3], conhecer as causas exatas e os elementos sinalizadores das exceções geradas durante a execução de serviços web permite que os desenvolvedores apliquem as técnicas adequadas de tratamento de exceções. Keywords—SOA; Fault Injection; Exception Propagation I. INTRODUÇÃO A Arquitetura Orientada a Serviços (SOA - do inglês: Service-Oriented Architecture) consiste em um estilo arquitetural que utiliza serviços como elementos básicos no desenvolvimento de aplicações. Essas aplicações são caracterizadas por serem fracamente acopladas e integrarem ambientes heterogêneos [17, 19]. Serviços web consistem em uma implementação da arquitetura SOA baseada em padrões da Internet [19], tais como: (i) o protocolo de comunicação HTTP; (ii) o protocolo SOAP (Simple Object Access Protocol), utilizado para troca de informações estruturadas de forma descentralizada e distribuída; (iii) a linguagem WSDL (Web Service Description Language), utilizada para descrever serviços, definir interfaces e mecanismos de interação; e (iv) o serviço de registro UDDI (Universal Description, Discovery and Integration), usado para descrição, descoberta e integração de serviços. Porém, para que os reais benefícios da adoção dos mecanismos de tratamento de exceções possam ser alcançados no contexto de aplicações SOA, experimentos precisam ser realizados para responder às seguintes questões: Como as exceções estão fluindo nos serviços web? Elas realmente podem contribuir para melhorar a robustez dos sistemas? Alguns trabalhos [3, 13, 14, 15] vêm sendo propostos com o objetivo de responder a essas e outras perguntas. Em [3] os mecanismos de propagação de exceções de dois toolkits de desenvolvimento de serviços web usados para desenvolver aplicações na linguagem Java foram analisados. Em [13] são exibidas as funcionalidades (camadas) necessárias para construir uma aplicação orientada a serviços e como essas diferentes camadas são mapeadas na arquitetura do aplicativo. Apesar de o trabalho classificar as camadas de acordo com as falhas que nelas ocorrem, não é realizado um estudo experimental do desempenho das aplicações quando essas exceções são geradas. De forma similar aos trabalhos apresentados em [13, 14, 15], existem classificações para as falhas nos serviços web, mas o desempenho dessas nas aplicações não é analisado através de um estudo experimental. Todavia, apesar de trazer vários benefícios ao desenvolvimento de software, este novo paradigma apresenta novos desafios e riscos ao desenvolvimento de aplicações robustas. Desenvolvedores de sistemas robustos frequentemente se referem a falhas como exceções porque falhas raramente se manifestam durante a atividade normal do sistema [26]. Em situações de falta, um componente gera exceções que modelam a condição de falta e o sistema deve realizar o tratamento daquelas exceções. Desta forma, tratamento de exceções é a capacidade que um software possui Neste contexto, o trabalho apresentado neste artigo estende o estudo descrito em Gorbenko et al [3] para uma plataforma ainda não analisada, a plataforma .NET da Microsoft. Como 46 dito por [28], a replicação é um dos elementos principais da experimentação e serve para aumentar a validade e a confiabilidade dos experimentos realizados. 
Segundo [24] e [25], a plataforma .NET é utilizada por 20% dos sites web conhecidos, principalmente em sites de negócio, compras e tecnologia, o que corresponde a 8.036.491 sites. Deste total, 1.485.430 sites estão entre os mais visitados na internet. O objetivo deste estudo foi analisar os mecanismos de propagação de exceções para verificar a confiabilidade em uma arquitetura orientada a serviços na plataforma .NET. Para atingir este objetivo, técnicas de injeção de faltas [23] foram utilizadas para simular as falhas propostas e analisar o comportamento do serviço web e da aplicação cliente no que diz respeito aos mecanismos de propagação de exceções. Algumas observações resultantes deste estudo são mostradas a seguir: • Apenas 22% das exceções propagadas identificaram as faltas injetadas. • Das 18 mensagens propagadas com as exceções, 14 não notificaram a causa raiz da falta injetada (este total corresponde a 78% das falhas analisadas). • Conhecer as características da propagação da exceção de cada falta simulada possibilita melhorar o tratamento especializado de falhas em arquiteturas orientadas a serviço. são fornecidos pontos de extensibilidade onde alguma funcionalidade pode ser adicionada. Esses pontos de extensibilidade podem ser utilizados para implementar uma grande variedade de comportamentos personalizados [12], como por exemplo: inclusão de mensagens ou parâmetros de validação, registro e transformações de mensagens, serialização personalizada, definição de formatos de desserialização, definição de cache de saída, agrupamento de objetos, tratamentos de falhas e de autorização. Fig. 1. Arquitetura WCF em execução Neste trabalho, a plataforma .NET e suas facilidades serão utilizadas como base para analisar os mecanismos de propagação de exceção em aplicações SOA, de acordo com o modelo de falhas apresentado na seção III. O Microsoft Visual Studio 2010 será utilizado como IDE de desenvolvimento. O restante deste artigo está organizado da seguinte forma: na Seção II são apresentados alguns conceitos básicos para o entendimento deste trabalho; a Seção III apresenta o modelo de falhas para aplicações orientadas a serviços considerados neste trabalho; na Seção IV é descrito o estudo realizado com o objetivo de analisar os mecanismos de propagação do fluxo de exceções em aplicações SOA na plataforma .NET; a Seção V discute os resultados do estudo; a Seção VI apresenta e discute trabalhos relacionados; e, por fim, a Seção VII apresenta conclusões e discussões sobre o presente trabalho. III. MODELO DE FALHAS Ao longo dos últimos anos, alguns modelos de falhas foram propostos com o objetivo de identificar tipos de falhas que ocorrem em serviços web. Estes modelos têm auxiliado na realização de testes de robustez, onde falhas são simuladas através de injeção [13, 23], seja em tempo de compilação ou em tempo de execução do serviço web. II. ARQUITETURA ORIENTADA A SERVIÇOS EM .NET A plataforma .NET tem a capacidade de executar diversas linguagens de programação, integrando-as como se fossem apenas uma. Através de um ambiente de execução, chamado CLR (Common Language Runtime), a plataforma .NET agrupa e gerencia todas essas linguagens, facilitando a manutenção e o desenvolvimento ao oferecer diversos serviços comuns às aplicações. No sentido de facilitar o entendimento da origem de cada falha, neste trabalho utilizamos a classificação proposta por [13] para organizar as falhas em serviços web em cinco camadas funcionais de acordo com o tráfego das mensagens entre as mesmas. Na Fig. 2. 
são apresentadas essas camadas divididas em camadas de componentes, qualidade de serviço, mensagens, transporte e comunicação, bem como o tráfego das mensagens entre elas. Note que existe uma correspondência entre a arquitetura de execução do WCF apresentada na seção II e a classificação em camadas definida por [13]. Na plataforma .NET, SOA é sinônimo de WCF (Windows Communication Foundation), a solução da Microsoft para o desenvolvimento de aplicações que se intercomunicam [12]. Essencialmente, o WCF descreve um fluxo de execução conforme descrito na Fig. 1. Tal fluxo é formado por um componente chamado distribuidor (no contexto do servidor) e um componente chamado de proxy (no contexto cliente) que são os responsáveis por fazerem a “tradução” das mensagens dos objetos do WCF para os demais métodos do framework .NET . A Fig. 1. ilustra como as mensagens seguem uma seqüência bem definida de etapas para realizar este processo, semelhante ao que é proposto em [13] e que será apresentado na seção III deste trabalho. Em cada passo ao longo do fluxo, 47 Note que as mensagens que descrevem as falhas analisadas não foram traduzidas para manter a consistência com as falhas propostas em Gorbenko et al [3] e em virtude das mensagens de exceções propagadas para o programador também serem em inglês. B. Falhas na camada de transporte As falhas desta camada são identificadas por uma falha de HTTP. Nota-se que, em contraste com a camada de comunicação, falhas nesta camada ocorrem na aplicação servidor e são, então, propagadas de volta para o remetente ao longo dos pontos L, M e N da Fig. 2. [13]. As falhas abaixo foram analisadas para essa camada e ocorrem nos pontos G e K da Fig. 2.: Fig. 2. Ilustração do tráfego das mensagens através das camadas inspirado em [13] As falhas propostas em Gorbenko et al [3], que são analisadas nesse trabalho, e os critérios definidos para a separação das camadas descritas em [13] serão explicados nas subseções seguintes. A. Falhas na camada de comunicação Falhas podem ocorrer no lado da aplicação cliente e no lado da aplicação servidor quando há perda de pacotes no tráfego entre a camada de transporte e a camada de comunicação (ponto D, N, F e L da Fig. 2.). Falhas podem decorrer de problemas de conectividade quando o cliente não consegue estabelecer uma conexão com o servidor (ponto E e M da Fig. 2.). Falhas de integridade também podem ocorrer nos dados trocados entre as partes (ponto E ou M da Fig. 2.) [13]. Network Connection break-off: Alguns fatores podem causar esse tipo de falha, como problemas físicos (intempéries, quebra de equipamentos, problemas no cabeamento, etc.), ou fatores como queda de sinal, por exemplo (ponto E ou M da Fig. 2.). • Domain Name System Down: É a inoperância do protocolo de gerenciamento de nomes e domínios (DNS). Com isso, a comunicação fica comprometida pela impossibilidade de tradução de endereços nominais em endereços IPs (ponto E ou M da Fig. 2.). • Remote host unavailable: A comunicação não é concretizada pela indisponibilidade do hospedeiro do serviço (ponto E ou M da Fig. 2.). • Application server is down: Ocorre pela queda do servidor de aplicação, seja por motivos físicos, seja em razão do alto número de solicitações de serviços, ou por outros motivos, como o desligamento intencional do servidor, por exemplo (ponto E ou M da Fig. 2.). • Error in web service name: É a chamada a um nome de serviço web inexistente. • Error in service port name: Requisição ao serviço em uma porta não adequada, para a qual não há serviço disponível. 
• Error in service operation’s name: Chamada de um método inexistente do serviço web. • Error in name of input parameter: Falha causada pela configuração dos nomes dos parâmetros do serviço requisitado errados. As falhas dessa camada são geradas a partir do processamento de mensagens de acordo com o protocolo SOAP. A camada de mensagens inclui a funcionalidade de codificação de mensagens oriundas da camada de qualidade de serviço (ponto B da Fig. 2.), adicionando informações de endereçamento sobre o destino da mensagem no cabeçalho SOAP e as passa para a camada de transporte (ponto C da Fig. 2.) [13]. A falha a seguir foi analisada para essa camada: • Loss of request/response packet: Ocorre quando há perda de pacotes durante a comunicação (pontos D, N, E, M, F e L da Fig. 2.). • Error in target namespace: Falha causada pela configuração errada do namespace do serviço requisitado. C. Falhas na camada de mensagens As seguintes falhas foram analisadas para essa camada: • • WS style mismatching: Falha no estilo do serviço web, se dá pela má configuração na ligação, trocando de “document” para “Rpc”, ou vice-versa. D. Falhas na camada de qualidade de serviço As falhas nessa camada correspondem aos requisitos não funcionais dos serviços, como, por exemplo, operações de segurança, confiabilidade e transações. No caso de nenhum componente da camada de qualidade de serviço ser utilizado, as mensagens são passadas a partir da camada de componente diretamente para a camada de mensagens, e dessa são transmitidas diretamente para a camada de componente (pontos B, P, H e J da Fig. 2.) [13]. A falha “Suspension of ws during transaction”, que ocorre quando há interrupção no provimento do serviço, durante uma 48 1) Seleção do contexto: A partir do endereço e porta gerados para acesso, foram realizadas várias alterações nos parâmetros de configuração, estado do serviço, tipos e valores dos parâmetros, a fim de simular as falhas e coletar métricas. transação entre o consumidor e o serviço foi explorada para a análise da falha nessa camada. E. Falhas na camada de componentes Na Fig. 3. é exemplificado em (1) uma classe .NET WSCalc, com o método chamado getMul() que será usado As regras de negócios, tipos e assinaturas dos métodos das aplicações são as origens das falhas nessa camada (pontos A, Q e I da Fig. 2.) [13]. As falhas listadas abaixo foram as responsáveis pelas análises feitas para o estudo das falhas nessa camada: • System run-time error: São falhas que ocorrem em tempo de execução, em nível de sistema, provocadas pelo código de execução do serviço web. Por exemplo, o estouro de pilha de execução, o acesso a posições inexistentes de vetores, ou falhas aritméticas de uma determinada operação do serviço. • Application run-time error ("Operand type mismatch"): São falhas geradas por código mal implementado, por exemplo, o uso de parâmetros considerados ilegais ou impróprios em determinado contexto, não coincidindo o parâmetro enviado e o esperado por uma função ou método. • Error Causing user-defined exception: Exceção implementada pelo serviço, estendendo um tipo de exceção, lançada por um método do serviço. • Input parameter type mismatch: Falha causada pela má configuração do tipo de parâmetro de entrada do serviço. • Output parameter type mismatch: Falha causada pela má configuração do parâmetro de retorno do serviço na ligação. • Mismatching of number of input params: Falha ocasionada pelo envio de uma quantidade incorreta de parâmetros. para simular as falhas que estão sendo analisadas. 
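O trecho abaixo é um esboço mínimo (nosso, não parte do artefato original) de como a classe WSCalc da Fig. 3 poderia ser declarada como serviço web em C#; o método getMulThrow e a classe CalcException são suposições ilustrativas construídas a partir das mensagens reportadas na Tabela I.

using System;
using System.Web.Services;

// Exceção definida pelo próprio serviço, usada na falha
// "Error causing user-defined exception" (nome CalcException hipotético).
public class CalcException : Exception
{
    public CalcException(string msg) : base(msg) { }
}

[WebService(Namespace = "http://tempuri.org/")]
public class WSCalc : WebService
{
    [WebMethod]
    public int getMul(int a, int b)
    {
        return a * b; // operação aritmética simples exposta pelo serviço
    }

    [WebMethod]
    public int getMulThrow(int a, int b)
    {
        // lança a exceção definida pelo serviço para observar sua propagação ao cliente
        throw new CalcException(
            "Exception throwed to test error 9. Propagation of an user defined exception.");
    }
}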
Em (2), essa classe estará no serviço web WSCalc, desenvolvido na IDE Visual Studio 2010 para .NET e hospedado no servidor web Internet Information Services (IIS), servidor de aplicação fornecido pela Microsoft responsável por gerenciar serviços e aplicações web, principalmente .NET. Para as injeções, desenvolveu-se um cliente em (3) com as especificações das faltas a serem analisadas e os códigos das injeções de faltas. Em (4), o cliente é executado e as injeções submetidas ao serviço web. As exceções propagadas serão analisadas em (5), primeiramente de forma comparativa de acordo com a classificação de falhas e, depois, em (6) de acordo com o desempenho que a respectiva exceção provoca no serviço web. Fig. 3. Simulação de Falhas com o WSCalc 2) Formulação de questões: Para que os reais benefícios possam ser avaliados, é tido como objetivo responder as seguintes perguntas, além dos questionamentos já citados anteriormente: Como as exceções estão fluindo nos serviços web? Elas realmente podem contribuir para uma melhor robustez dos sistemas? IV. CONFIGURAÇÃO DO EXPERIMENTO Na configuração do ambiente para o experimento, a máquina que respondeu como servidora possuía 2 cores Zion Xeon 3.2GHz (por core), com 5GB de RAM e sistema operacional Windows Server 2008 x64. A máquina utilizada como cliente foi uma Core i5 2.3 GHz, com 4GB de RAM e sistema operacional Windows 7 x64. 3) Seleção de faltas: As faltas foram selecionadas de acordo com o modelo de falhas apresentado na seção III. A seleção de faltas tem o objetivo de analisar cada uma das camadas da arquitetura de serviços web. A. Definição do Experimento 4) Projeto do estudo realizado: Para atingir o objetivo e responder os questionamentos, técnicas de injeção de faltas foram utilizadas para simular as falhas propostas na seção III e para analisar o comportamento do serviço web e da aplicação cliente no que diz respeito aos mecanismos de propagação de exceções. Utilizando a linguagem C#, foi desenvolvido um serviço web similar ao “WSCalc” apresentado em Gorbenko et al [3]. O objetivo foi analisar os mecanismos de propagação do fluxo de exceções em aplicações SOA, desenvolvidas com o framework .NET, através de uma extensão do estudo descrito em Gorbenko et al [3]. O objetivo deste experimento foi responder as seguintes questões: Como as falhas são reportadas? E como serviços web desenvolvidos em plataformas diferentes se comportam? C. Execução através da Injeção de Faltas Injeção de faltas é uma técnica utilizada para avaliar processo de validação de confiabilidade, sendo utilizada para auxiliar na localização de defeitos no software. Esta técnica B. Planejamento do Experimento 49 “Error in Web Service Name”, por sua vez, o nome correspondente ao método foi alterado por um nome de método inexistente na chamada do serviço web. Por fim, em “Error in Service Port Name”, mudou-se o número da porta de endereço do serviço web desejado. pode ser também utilizada para a verificação de segurança e análise de testabilidade de software [23]. Tendo o objetivo de aumentar e melhorar a cobertura de testes, a fim de fazer com que a execução do serviço web vá para estados que causem falhas no sistema, seja através da introdução de códigos ou de outros meios, a injeção de faltas possui alto índice de importância no desenvolvimento de softwares robustos. Determinadas falhas podem ser a causa para ocorrências de outras. Para a utilização dessa técnica é importante determinar os locais onde as mesmas podem ser injetadas. 
A simulação da falha “Error in service operations name” e das demais foram realizadas alterando os parâmetros de ligação no arquivo WSDL da aplicação cliente. D. Métricas Adotadas Baseado em Gorbenko et al [3], foram coletadas quatro métricas: número de stack traces, o tipo de exceção gerada, a mensagem excepcional retornada à aplicação cliente e o tempo contabilizado da chamada ao serviço no momento da captura da exceção. Algumas falhas podem ser simuladas com a queda de serviços. Por exemplo, a falha “Connection break-off” foi simulada com o desligamento da conexão de rede entre o cliente e o servidor durante a ligação. Na simulação de “Domain Name System Down” foi definido um endereço de serviço de DNS incorreto nas configurações do sistema operacional, e a aplicação cliente foi executada logo depois, ocasionando uma falha na resolução de endereço IP. O número de stack traces mostra o tamanho da pilha de chamadas, ou seja, o número de métodos por onde a exceção passou até chegar ao topo da pilha, até ser capturada por um bloco catch. Através dessa métrica é possível analisar o tamanho do caminho percorrido pela exceção até seu destino. O tipo da exceção é sua classe, o tipo de objeto que será capturado pelo bloco catch na aplicação cliente. A mensagem excepcional retornada é a mensagem gerada pelo próprio framework, sendo a exceção um objeto que possui muitos atributos. A mensagem é um dos atributos, que contém informações acerca da falha ocorrida. Por fim, o tempo gasto fornece uma noção do tempo necessário para a plataforma identificar e tratar uma falha. Na simulação de “Remote host unavailable”, a comunicação entre os hosts (cliente e servidor) foi interrompida com a desconexão da VPN (Virtual Private Connection). Seguindo a mesma linha de raciocínio de algumas simulações anteriores, é possível simular “Application server is down” com a inoperância do serviço IIS (Internet Information Services). Durante a simulação, o serviço IIS foi desligado no servidor. A falha “Suspension of ws during transaction” foi simulada por meio da inserção de trechos de códigos que atrasam a finalização do serviço, provendo tempo hábil para suspender o IIS manualmente, durante a execução da requisição. Na simulação das falhas, arquivos de log foram gerados em cada experimento, contendo: (i) o tipo de exceção gerada; (ii) a mensagem de exceção retornada; (iii) o número de stack traces (como também o estado da pilha com o caminho percorrido); (iv) o tempo de início e de fim a partir da geração da falha e (v) a descrição do método utilizado para simular a falha. Para simular “System run-time error” foi acrescentada uma operação de divisão dentro do método, e na passagem de parâmetros para o serviço web foi passado um valor zero para o denominador. Como divisões por zero não existem, a falha gerada pela execução dessa instrução foi, então, coletada. A falha “Application run-time error” foi simulada com a utilização de parâmetros ilegais. Inicialmente foi publicado um serviço web que recebia parâmetros do tipo String. O código do cliente foi implementado para invocar tal serviço. Após esse passo, o serviço web foi publicado novamente, agora recebendo parâmetros do tipo inteiro. Como a aplicação cliente não foi atualizada para a nova interface, a aplicação cliente invocou o serviço passando parâmetros com o tipo String. Isto acarretou em uma exceção, devido à conversão de uma palavra para inteiro. 
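Um esboço simplificado (nosso, sob as suposições anteriores) de como o cliente de injeção poderia registrar, a cada falta simulada, as quatro métricas descritas na seção de métricas: tipo da exceção, mensagem, número de entradas da stack trace e tempo decorrido até a captura.

using System;
using System.Diagnostics;
using System.IO;

public static class InjetorDeFaltas
{
    // Executa uma chamada ao serviço e grava as métricas coletadas em um arquivo de log.
    public static void ExecutarCaso(string descricaoDaFalta, Action chamadaAoServico)
    {
        var cronometro = Stopwatch.StartNew();
        try
        {
            chamadaAoServico();            // ex.: () => proxy.getMul(3, 5)
            Registrar(descricaoDaFalta, "sem exceção", "-", 0, cronometro.ElapsedMilliseconds);
        }
        catch (Exception e)
        {
            cronometro.Stop();
            int profundidadeDaPilha = (e.StackTrace ?? "")
                .Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
                .Length;                   // número de entradas da stack trace
            Registrar(descricaoDaFalta, e.GetType().Name, e.Message,
                      profundidadeDaPilha, cronometro.ElapsedMilliseconds);
        }
    }

    static void Registrar(string falta, string tipo, string mensagem, int pilha, long delayMs)
    {
        File.AppendAllText("log.txt",
            $"{falta}; {tipo}; {mensagem}; stack={pilha}; delay={delayMs} ms{Environment.NewLine}");
    }
}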
Apesar dos tipos de parâmetros ser irrelevante na interface do serviço, do ponto de vista do protocolo SOAP, que utiliza uma representação de dados essencialmente textual, o serviço internamente estava esperando um valor do tipo inteiro, por isso a exceção em questão foi propagada. A Tabela I detalha os tipos de exceção e as mensagens propagadas em cada uma das falhas testadas. As falhas que estão na cor cinza na Tabela I são as únicas falhas que tiveram as exceções propagadas com as mensagens exatas para a falha analisada. Na simulação de “Error causing user-defined exception”, um método foi criado no serviço web para lançar uma exceção implementada pelo próprio serviço. Em “Error in target name space”, alguns arquivos de configuração que continham parâmetros de namespace foram modificados diretamente. Em 50 TABELA I. INFORMAÇÕES DE ALTO NÍVEL LEVANTADAS PELAS FALHAS SIMULADAS Nº Descrição da Falha Tipo da Exceção EndpointNotFo undException Network connection break-off 2 Domain Name System Down FaultException 3 Loss of packet with client request or service response Remote Host Unavailable - 5 Application Server is Down EndpointNotFo undException 6 Suspension Of WS During Transaction Communication Exception 7 System Error Run-Time FaultException 8 Application run-time error ("Operand Type Mismatch") Error Causing userdefined exception FaultException 10 Error in Target Name Space FaultException 11 Error in Web Service Name EndPointNotFo undException 12 Error in Service Port Name EndPointNotFo undException 13 Error in service operations’s name FaultException 14 Output Parameter type mismatch Input Parameter Type Mismatch Error in name of input parameter Mismatching OfNumberOfInputPar ameters WS style mismatching (“Rpc” or “document”) - Server did not recognize the value of HTTP Header SOAPAction: http://tempur.org/getMulThrow. There was no endpoint listening at http://10.84.110.32:8081/wscal.asmx that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. There was no endpoint listening at http://10.84.110.32:8080/wscalc.asmx that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. Server did not recognize the value of HTTP Header SOAPAction: http://tempuri.org/getMul. OK – Correct output without exception - OK – Correct output without exception - Incorrect output without exception - Incorrect output without exception - OK – Correct output without exception 9 15 16 17 18 EndpointNotFo undException FaultException Para a análise das informações coletadas de propagação entre diferentes falhas e o desempenho da propagação dessas falhas, foram simuladas e injetadas as faltas descritas na seção IV, com exceção da falha “Loss of packet with client request or service response”. Mensagem da Exceção 1 4 V. RESULTADOS OBTIDOS There was no endpoint listening at http://10.84.110.32:8081/wscalc.asmx that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. Server did not recognize the value of HTTP Header SOAP Action: http://tempuri.org/getMul. - A. Análise de compatibilidade Como descrito na seção IV, foram realizados experimentos com um serviço web simples com métodos que realizam operações aritméticas. A Fig. 4. 
quantifica e sumariza as mensagens de retorno propagadas nas exceções das falhas analisadas e que serão explicadas em detalhes nas subseções seguintes. There was no endpoint listening at http://10.84.110.32:8081/wscalc.asmx that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. There was no endpoint listening at http://10.84.110.32:8081/wscalc.asmx that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. An error occurred while receiving the HTTP response to http://10.84.110.32:8081/wscalc.asmx. This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details. Server was unable to process request. -->Attempted to divide by zero. Fig. 4. Gráfico quantitativo das mensagens propagadas Server was unable to process request. --> Input string was not in a correct format. Server was unable to process request. --> Exception throwed to test error 9. Propagation of an user defined exception. 1) Falhas compartilhando a mesma exceção e a mesma mensagem: Alguns grupos de exceções se formam por propagarem uma mesma informação acerca da falha. As seguintes falhas, por exemplo, “Network Connection breakoff”, “Remote host unavailable”, “Application server is down”, “Error in web service name” e “Error in service port name” propagam a mesma informação de falha (ver Tabela I). Dentre essas, todas, exceto a falha “Error in web service name”, propagam um mesmo tipo de exceção: “EndPointNotFoundException”; as falhas “Domain Name System Down”, “Error in target namespace” e “Error in service operations name” também propagam uma mesma mensagem e um mesmo tipo de exceção, dessa forma torna-se inviável responder a exata origem da falha propagada. Dentre as falhas agrupadas anteriormente, algumas podem ser diferenciadas por informações auxiliares embutidas na mensagem propagada. No primeiro grupo, as falhas “Error in web service name” e “Error in service port name” podem ser reconhecidas por um desenvolvedor atento, pois na mensagem propagada o framework .NET inclui o endereço cuja aplicação tentou acessar. Nessa informação consta o nome do serviço e a porta da requisição. Dessa forma, o desenvolvedor pode identificar se uma dessas duas informações está errada. Por exemplo, no experimento da falha “Network Connection break-off”, um trecho da mensagem é “There was no endpoint listening at http://10.84.110.32:8081/wscalc.asmx(...)”. Note 51 demanda tempo e reduz o desempenho do tratamento de exceções em uma arquitetura orientada a serviços. Esse caminho percorrido por uma exceção até a sua captura pode ser analisado por meio da sua “stack trace”, uma característica comum em mecanismos de tratamento de exceções de linguagens orientadas a objetos. A “stack trace” mostra o caminho por onde a exceção passou (classes, métodos, trechos específicos de código). que no endereço apresentado é possível identificar a porta “8081” e o nome do serviço “wscalc.asmx”. No segundo grupo, das falhas “Domain Name System Down”, “Error in target namespace” e “Error in service operation’s name”, é possível identificar se ocorre a falha “Error in target namespace” verificando o endereço da requisição disponível na mensagem propagada e checando a informação de namespace. 
Caso esse não esteja correto, podese inferir que ocorreu a falha “Error in target namespace”. No experimento “Domain Name System Down”, por exemplo, foi propagada a mensagem “Server did not recognize the value of HTTP Header SOAPAction: http://tempuri.org/getMul.”, onde “tempuri.org” é o namespace da requisição em questão. A Tabela II apresenta os resultados do experimento acerca da propagação de exceções e do desempenho (tempo). A tabela inclui o número de elementos na cadeia de propagação (“stack trace”) e o tempo de delay que representa o tempo contado antes da chamada até o momento da captura em um bloco catch. A primeira linha representa uma execução normal, sem falhas. 2) Falhas com Informações Exatas Acerca da Falha Ocorrida: A falha “Suspension of ws during transaction” retornou uma exceção do tipo “CommunicationException” e uma mensagem que descreve exatamente a falha provocada no experimento. A falha “System run-time error” retornou uma exceção do tipo “FaultException” que representa uma falha SOAP, sem representar uma falha específica, porém a mensagem retornada descreveu exatamente a falha simulada. A falha “Application run-time error ("Operand type mismatch")” também retornou uma exceção “FaultException”, mas a mensagem informa que a string estava em um formato incorreto. TABELA II. ANÁLISE DE DESEMPENHO DO MECANISMO DE PROPAGAÇÃO DE EXCEÇÕES 3) Falhas que não geraram exceções no sistema ou com retorno diferente de exceção: As falhas “Output parameter type mismatch”, “Input parameter type mismatch” e “WS style mismatching” não geraram exceções no sistema, todas retornaram o resultado esperado e não propagaram exceção. Especialmente as falhas “Output parameter type mismatch” e “Input parameter type mismatch” funcionam bem quando os valores enviados ou recebidos, mesmo não pertencendo ao tipo esperado, são compatíveis com uma conversão. Por exemplo: Caso o método do serviço web “getmul” exija dois parâmetros inteiros e a aplicação cliente forneça dois parâmetros do tipo string, o funcionamento dependerá do valor dessa string. Caso sejam números válidos como “15”, ou “2”, a requisição funcionará normalmente. Caso sejam valores como “15a” ou “2b”, ocorrerá uma falha de conversão que será propagada de forma semelhante à falha “Application runtime error ("Operand type mismatch")”. Nº Descrição da Falha 1 2 3 4 5 6 7 Without error / failure (Sem erro / falha) Network connection break-off Domain Name System Down Remote Host Unavailable Application Server is Down Suspension Of WS During Transaction System Run-Time Error 0 14 10 13 13 14 11 Delay da Propagação de Exceção, ms 46 25.924 6.000 21.329 27.024 14.378 169 8 11 221 9 Calculation run-time error ("Operand Type Mismatch") Error Causing user-defined exception 11 335 10 Error in Target Name Space 11 02.551 11 12 13 Error in Web Service Name Error in Service Port Name Error in service operations’s name 14 14 10 1.207 1.780 93.000 Nº Stack Traces A Tabela II mostra que as falhas “System run-time error”, “Application run-time error ("Operand type mismatch")” e “Error causing user-defined exception” são as falhas com menor tempo de espera até o retorno, isso se deve ao fato de que essas falhas não estão atreladas a problemas de rede ou ligação. Todas elas pertencem à categoria de falhas de serviço. A única exceção dessa categoria é a falha “Suspension of ws during transaction” que trata exatamente de uma suspensão do serviço, o que faz a aplicação tentar a conexão durante repetidas vezes até a falha se consolidar. 
A falha “Error Causing user-defined exception” não retornou o tipo da exceção criado pelo serviço, retornando por sua vez um tipo “FaultException”, contudo, a mensagem propagada foi a mesma mensagem configurada na exceção original, instanciada pelo serviço. Foram analisadas as informações da stack trace com o objetivo de identificar possíveis pontos de diferenciação entre falhas que propagam um mesmo tipo excepcional e mesma mensagem, como nos grupos descritos na sessão anterior. No segundo grupo, formado pelas falhas “Domain Name System Down”, “Error in target namespace” e “Error in service operation’s name”, é possível diferenciar a falha “Error in target namespace” pela quantidade de caminhos por onde ela foi propagada. Enquanto as falhas “Domain Name System Down” e “Error in service operation’s name” passaram por 10 caminhos, a falha “Error in target namespace” passa por 11, Nas falhas “Error in name of input parameter” e “Mismatching of number of input params”, nenhuma exceção foi lançada e o valor zero foi retornado, dificultando, portanto, o trabalho do desenvolvedor em prover robustez e confiança. B. Análise de desempenho Segundo Gorbenko et al [3], antes de uma exceção ser capturada pela aplicação cliente, essa mesma exceção pode ser encapsulada várias vezes anteriormente, durante seu lançamento ao longo de vários métodos. Esse processo 52 VI. TRABALHOS RELACIONADOS podendo ser um ponto de distinção entre as falhas, e possibilitando tratá-la especificamente. Existem muitos trabalhos envolvendo aplicações SOA e tratamento de exceções, porém a maior parte desses é dedicada a técnicas sistemáticas de geração de faltas, ataques e injeção de casos de teste, ou técnicas de detecção de falhas para testes de perda de pacotes de rede e corrupção de mensagens SOA. Poucos trabalhos focam no comportamento e propagação das exceções lançadas e no desempenho com relação a esses mecanismos. A falha “Loss of request/response packet” não está na tabela por não ter sido simulada. As falhas “Output parameter type mismatch”, “Input parameter type mismatch”, “Error in name of input parameter”, “Mismatching of number of input params”, “WS style mismatching” não estão na tabela, pois não retornaram nenhuma exceção. C. Comparando com soluções em Java Em [3], os mecanismos de propagação excepcional de dois toolkits de desenvolvimento de serviços web (Java cross platform desenvolvido pela SUN e o IBM WSDK) são analisados, bem como suas implicações no desempenho das aplicações. O objetivo do trabalho é analisar os mecanismos de propagação de exceções dos dois toolkits de desenvolvimento e entender suas implicações para o desempenho de aplicações SOA. No entanto, o mesmo não estende o estudo para outras plataformas como a do framework .NET. As mensagens de exceção propagadas em cada uma das falhas testadas coletadas nesse trabalho com as informações coletadas em Gorbenko et al [3] foram comparadas, de acordo com os tipos de mensagens atribuídas às falhas analisadas e apresentadas nas subseções anteriores. Apenas os resultados obtidos na análise de compatibilidade foram comparados, em virtude da ameaça à validade de uma possível comparação dos resultados obtidos na análise de desempenho, pois não é possível garantir que os experimentos foram submetidos a uma mesma configuração de hardware, o que afetaria em uma possível variação encontrada no desempenho entre as plataformas. 
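Com base nas observações acima — mensagens que embutem endereço, porta, nome do serviço ou namespace, e a profundidade da stack trace — um tratamento mais especializado poderia, em tese, ser esboçado como abaixo. O trecho é ilustrativo e de nossa autoria; a heurística de distinção é uma suposição construída a partir das mensagens da Tabela I, não parte do estudo.

using System;
using System.ServiceModel;

public static class DiagnosticoDeFalhas
{
    // Tenta estreitar a causa raiz a partir do tipo e do conteúdo da mensagem propagada.
    public static string Classificar(Exception e)
    {
        if (e is EndpointNotFoundException &&
            e.Message.Contains("There was no endpoint listening at"))
        {
            // A mensagem inclui a URL requisitada: conferir o nome do serviço
            // (ex.: "wscalc.asmx") e a porta (ex.: "8081").
            return "Endpoint inacessível: verificar nome do serviço, porta ou conectividade";
        }
        if (e is FaultException &&
            e.Message.Contains("did not recognize the value of HTTP Header SOAPAction"))
        {
            // Mensagem compartilhada por "Domain Name System Down", "Error in target namespace"
            // e "Error in service operation's name"; conferir o namespace (ex.: tempuri.org).
            return "Requisição SOAP rejeitada: verificar namespace ou nome da operação";
        }
        if (e is CommunicationException)
        {
            return "Falha de comunicação: possível suspensão do serviço durante a transação";
        }
        return "Falha não classificada: " + e.GetType().Name;
    }
}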
Em [13], são exibidas as funcionalidades (camadas) necessárias para construir uma aplicação orientada a serviços e como essas diferentes camadas são mapeadas na arquitetura do aplicativo. Com esta descrição, são identificados os diferentes tipos de falhas que podem ocorrer e como classificá-las de acordo com a camada onde a falha ocorre, fornecendo uma visão geral de como elas são manipuladas e investiga como as aplicações se relacionam com os tipos de falhas. Em [14], é analisada a confiabilidade dos serviços web, injetando faltas em mensagens, classificando-as em falhas físicas, falhas de software, falhas de gerenciamento de recursos, falhas de comunicação e falhas do ciclo de vida. Já em [15], é proposta uma estrutura de arquitetura que tem capacidade de tratamento de exceção, garantindo a credibilidade do software orientado a serviços. Durante a sua fase de arquitetura são adicionados elementos de tratamento de exceção relacionando a arquitetura do software e a modelagem de processos de tratamento de exceção, permitindo uma separação clara de interesses entre a função de negócios e a unidade de tratamento de exceção, desempenhando assim um papel importante para alcançar sistemas orientados a serviços confiáveis. Tanto em [13], como em [14] e [15], não é realizado um estudo experimental do desempenho das aplicações quando essas exceções são geradas. A Tabela III mostra a relação entre os resultados obtidos na análise de serviços web desenvolvidos na plataforma Java [3] com os resultados obtidos por este trabalho. Os números na primeira coluna da Tabela III correspondem às descrições das falhas especificadas na Tabela I. Conforme é observado na Tabela III, 7 das 18 falhas analisadas possuem a mesma classificação de acordo com as mensagens de exceção propagadas, tanto nesse trabalho (plataforma .NET), quanto no trabalho de Gorbenko et al [3] (plataforma Java). O que corresponde a 39% das falhas analisadas. Com tal resultado, nota-se que serviços web desenvolvidos em plataformas diferentes comportam-se diferentemente em falhas semelhantes. TABELA III. COMPARAÇÃO COM OS RESULTADOS OBTIDOS EM [3] Nº 1 2 3 4 5 6 7 8 9 Tipo de Mensagem nesse trabalho (Plataforma .NET) Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem informações exatas acerca da falha informações exatas acerca da falha informações exatas acerca da falha Não geraram exceção ou com retorno diferente 10 Mesma exceção mensagem e mesma 11 Mesma exceção mensagem e mesma 12 Mesma exceção mensagem e mesma 13 Mesma exceção e mesma mensagem Não geraram exceção ou com retorno diferente 14 15 16 17 18 Não geraram exceção ou com retorno diferente Não geraram exceção ou com retorno diferente Não geraram exceção ou com retorno diferente Não geraram exceção ou com retorno diferente Tipo de Mensagem em [3] (Plataforma Java) Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem informações exatas acerca da falha Mesma exceção e mesma mensagem informações exatas acerca da falha Não geraram exceção ou com retorno diferente Não geraram exceção ou com retorno diferente Não geraram exceção ou com retorno diferente Mesma exceção e mesma mensagem Não geraram exceção ou com retorno diferente Mesma exceção e mesma mensagem Não geraram exceção ou com retorno diferente Mesma exceção e mesma mensagem Mesma exceção e mesma mensagem Correspondência Igual Igual Igual Igual VII. 
Diferente Esse trabalho teve como objetivo estender a pesquisa realizada em Gorbenko et al [3], o qual realizou os mesmos experimentos na linguagem Java e para dois toolkits diferentes. O objetivo foi entender o quão efetivo são os mecanismos de propagação de exceções para o provimento de robustez em uma arquitetura orientada a serviços que utiliza a plataforma .NET. De acordo com a pesquisa realizada, as seguintes conclusões foram encontradas: Igual Diferente Diferente Diferente • Diferente Diferente Diferente Igual Diferente Igual Diferente Diferente CONCLUSÃO 53 A conclusão dada em Gorbenko et al [3], de que serviços web desenvolvidos em plataformas diferentes comportam-se diferentemente em falhas semelhantes foi confirmada. • Falhas relacionadas a problemas de rede ou de ligação com o serviço geram um tempo de espera até que a exceção seja capturada, muito superior a falhas de sistema ou aplicação. [8] • As mensagens propagadas com as exceções na maioria dos casos não informam a causa raiz da falha ocorrida. [9] • Conhecendo as características da propagação da exceção de cada falha simulada é possível melhorar o tratamento especializado de falhas em arquiteturas orientadas a serviço, ajudando assim o trabalho do desenvolvedor no provimento de uma solução robusta. [7] [10] [11] [12] É observado então, de forma sistemática, que a identificação da falha ocorrida pela mensagem de exceção propagada foi de apenas 22% das falhas analisadas. As conclusões acima indicam a necessidade do aperfeiçoamento dos mecanismos de propagação de exceções em SOA. Como trabalhos futuros, é pretendido expandir o estudo para outras tecnologias, como a REST, por exemplo, que tem crescido recentemente, bem como testar a abordagem proposta com serviços em aplicações comerciais. A semântica das exceções também é uma discussão importante a ser explorada em trabalhos futuros, apesar da análise realizada já apresentar alguns indícios sobre possíveis problemas. VIII. [13] [14] [15] [16] [17] [18] AGRADECIMENTOS [19] Agradecemos o apoio de todos que contribuíram para o desenvolvimento deste trabalho. Nélio é parcialmente apoiado pela FAPERN (PPP-III/79-2009) e pelo INES (CNPq 573964/2008-4). [20] [21] REFERÊNCIAS [1] [2] [3] [4] [5] [6] [22] Chan, P. P. W., Lyu, M. R. and Malek, M. (2006). Making Services Fault Tolerant. Proceeding ISAS'06 Proceedings of the Third international conference on Service Availability p. 43-61. Durães, J., Vieira, M. and Madeira, H. (2004). Dependability benchmarking of web-servers. Computer Safety Reliability and Security, p. 297-310. Gorbenko, A., Romanovsky, A., Kharchenko, V. and Mikhaylichenko, A. (2008). Experimenting with exception propagation mechanisms in service-oriented architecture. Proceedings of the 4th international workshop on Exception handling - WEH ’08, p. 1-7. Tartanoglu, F., Issarny, V., Chesnay, L. and Levy, N. (2003). Coordinated Forward Error Recovery for Composite Web Services. Published in SRDS (2003) p. 167-176. W3C, Web Services Architecture. 2004.http://www.w3.org/TR/2004/NOTE-ws-arch-20040211 Gorbenko,A.,Mikhaylichenko, A., Kharchenko,V.,Romanovsky,A. (2007). Experimenting With Exception Handling Mechanisms Of Web Services Implemented Using Different Development Kits. Proceeding WEH '08 Proceedings of the 4th international workshop on Exception handling p. 1-7. [23] [24] [25] [26] [27] [28] 54 Mattsson, M. "Object-Oriented Frameworks: A Survey of Methodological Issues", M.Sc. 
Dissertation, Department of Computer Science and Business Administration, University College of Karlskrona/Ronneby, LU-CS-96-197, 1996. Gamma, E., Helm, R., Johnson, R. & Vlissides, J. (1995). Design patterns: Elements of reusable object-oriented software. Publisher: Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA ©1995 ISBN:0-201-63361-2. FAYAD, M.; SCHMIDT, D.; JOHNSON, R. Application frameworks. In: Building Application Frameworks. New York:Wiley, 1999. p. 1-27. Pinto, J. M., & Moro, L. F. L. (2000). A planning model for petroleum refineries. Brazilian Journal of Chemical Engineering, 17(4–7), 575– 585. Sampaio, Cleuton; SOA e Web Services em Java/Cleuton Sampaio. Rio de Janeiro: Brasport, 2006. Low, Juval; Programming WCF Services. 2007. O'Reilly Media, Inc., Dec 17, 2008. Kopp, O.; Leymann F.; Wutke D. (2010). Fault Handling in the Web Service Stack. In: ICSOC 2010, Springer (2010). Looker, N., Munro, M., Xu, J.: Simulating errors in web services. International Journal of Simulation Systems 5(5) (2004) 2937. Qing Wu, Shi Ying, You Cong Ni, Hua Cui (2010). The Architecture Framework with Exception Handing in SOA. Applied Mechanics and Materials, 20-23, 992. David Martin, M. B. (September de 2007). Bringing Semantics to Web Services with OWL-S. World Wide Web, 10, pp. 243—277. Eric A. Marks, M. B. (2006). Service-Oriented Architecture: A Planning and Implementation Guide for Business and Technology. Wiley. Michael P. Papazoglou, P. T. (2007). Service-Oriented Computing:State of the Art and Research Challenges (Vol. 40). IEEE Computer Society. Mike P. Papazoglou, P. T. (2008). Service-Oriented Computing: a Research Roadmap. International Journal of Cooperative Information Systems, 17, pp. 223-255. R. Vaculín, K. S. (2007). Specifying and monitoring composite events for semantic web services. 5th IEEE European Conference on Web Services. IEEE Computer Society. Roman Vaculín, K. W. (2008). Exception Handling and Recovery of Semantic Web Services. Fourth International Conference on Networking and Services (pp. 217-222). Gosier, Guadeloupe: IEEE Computer Society. Sharma Chakravarthy, V. K.-K. (1994). Composite events for active databases: Semantics, contexts and detection. Twentieth International Conference on Very Large Databases, (pp. 606-617). Hossain M. S. (2006). Web Service Based Software Implemented Fault Injection. Information Technology Journal 5 (1): 138-143 ASP.NET Usage Statistics (Mar/2013) http://trends.builtwith.com/framework/ASP.NET Usage statistics and market share of ASP.NET for websites (Apr/2013) http://w3techs.com/technologies/details/pl-aspnet/all/all Goodenough, J. B. Exception handling: issues and a proposed notation. Commun. ACM 18, 12 (Dec. 1975), 683-696. Garcia, A.; Rubira, C; Romanovsky, A.; Xu, J. (2001) A Comparative Study of Exception Handling Mechanisms for Building Dependable Object-Oriented Software. In: Journal of Systems and Software, Elsevier, v. 59, n. 6, November, p. 197-222. O. S. Gomez, N. Juristo, and S. Vegas, Replications types in experimental disciplines, in ACM-IEEE ESEM 2010. 
A Systematic Mapping Study on Service Oriented Computing in the Context of Quality of Services Danilo Filgueira Mendonça, Genaı́na Nunes Rodrigues, Aletéia Favacho e Maristela Holanda Departamento de Ciência da Computação Universidade de Brası́lia, Brasil 70910-900 Email: [email protected], {genaina,aleteia,mholanda}@cic.unb.br Abstract— Background: In the last years, the field of service oriented computing (SOC) has received a growing interest from researchers and practitioners, particularly with respect to quality of service (QoS). Aim: This paper presents a mapping study to aggregate literature in this field in order to find trends and research opportunities regarding QoS in SOC. Method: Following well established mapping study protocol, we collected data from major digital libraries and analysed 364 papers aided by a tool developed for this purpose. Results: With respect to SOC contributions dealing with QoS properties, we were able to find out which SOC as well as which QoS facets are the focus of research. Our mapping was also able to identify those research groups that have mostly published in the context of our study. Conclusions: Most of the studies concentrate on runtime issues, such as monitoring and adaptation. Besides, an expressive amount of papers focused on metrics, computational models or languages for the context of Qos in SOC. Regarding quality attributes, a vast majority of the papers use generic models, so that the proposed solutions are independent of the particularities of a quality attribute. In spite of that, our study reveal that availability, performance and reliability were the major highlights. With respect to research type, many of the reviewed studies propose new solutions, instead of evaluating and validating existing proposals– a symptom of a field that still needs established research paradigms. Resumo— Motivação: Nos últimos anos, o campo da computação orientada a serviços (SOC) recebeu um crescente interesse de pesquisadores e praticantes, particularmente no que diz respeito a qualidade de serviços (QoS). Objetivo: Esse artigo apresenta um mapeamento sistemático para agrupar a literatura nesse campo, buscando identificar tendências e oportunidades de pesquisa relacionadas a QoS em SOC. Metodologia: Seguindo protocolo bem estabelecido de mapeamento sistemático, coletamos dados das principais bibliotecas digitais em Ciência da Computação e analisamos 364 artigos, com o auxı́lio de uma ferramenta implementada para o propósito desse estudo. Resultados: Com relação às contribuições de SOC que lidam com propriedades de QoS, identificamos quais propriedades de SOC, assim como aquelas de QoS estão no foco da pesquisa. Nosso mapeamento também identificou os grupos de pesquisa que mais tem publicado no contexto desse estudo. Conclusões: A maioria dos estudos analisados concentram-se em questões de tempo de execução, tais como monitoramento e adaptação. Mas também, houve expressiva parcela de trabalhos focados em métricas, modelos computacionais ou linguagens para o contexto de QoS em SOC. No tocante a atributos de qualidade, uma vasta maioria dos artigos utilizam modelos genéricos, de modo que as soluções propostas são independentes das particularidades de determinados atributos de qualidade. Não obstante, o estudo revela que disponibilidade, desempenho e confiabilidade foram os maiores destaques. 
A respeito do tipo de pesquisa, a maioria dos estudos avaliados propuseram novas soluções em vez de avaliarem ou validarem propostas existentes – um sintoma de um campo que ainda necessita de paradigmas de pesquisa estabelecidos. I. I NTRODUÇ ÃO A Computação Orientada a Serviços (ou Service Oriented Computing – SOC) emergiu como um novo paradigma de computação com intuito de prover maior eficiência à provisão e consumo de recursos computacionais, utilizando não somente serviços como componentes básicos para o desenvolvimento de aplicações, mas também componentes diversos de infraestrutura, tais quais servidores web e bancos de dados [1] [2]. Além disso, por meio da Arquitetura Orientada a Serviços (SOA), definiu-se diretrizes, objetivos e princı́pios que norteiam e organizam o desenvolvimento, manutenção e uso de serviços. Um dos objetivos em SOC é a automação e integração de processos de negócio ou cientı́ficos, que podem envolver diferentes organizações e domı́nios administrativos. Assim, por meio de componentes distribuı́dos, desacoplados, autônomos e auto descritos, SOC visa aumentar o reuso, a agilidade e o retorno de investimento da tecnologia da informação (TI) [3]. Dada as caracterı́ticas tipicamente distribuı́das, independentes e heterogêneas encontradas nos cenários aos quais SOC foi proposta, novas preocupações relacionadas à garantia de qualidade de serviço (Quality of Service — QoS) motivaram um interesse crescente dos pesquisadores na área de QoS em SOC. Quando novas áreas de pesquisa emergem, particularmente quando amparadas por certo apelo da indústria, é natural que uma quantidade significativa de contribuições cientı́ficas foquem primariamente em novas propostas de solução (fases que compreendem o momento da formulação de um paradigma por uma comunidade ainda restrita ao momento do seu refinamento e exploração por uma audiência mais ampla [4]). Por outro lado, com o amadurecimento da pesquisa realizada em uma área, espera-se o aprofundamento em termos de evidências relacionadas à aplicabilidade das técnicas pro- 55 postas, para que em seguida as mesmas sejam adotadas pela comunidade— Redwine et al. argumenta que esse ciclo de maturação dura aproximadamente 20 anos para a área de tecnologia, em particular para a área de software [4]. Existem na literatura trabalhos que apontam para potenciais de pesquisa relacionadas a confiabilidade em SOC [5]. Em uma pesquisa recém realizada por Ameller et al. [6], os autores evidenciam o fato que 68% dos arquitetos e projetistas de software participantes da pesquisa consideram igualmente importantes atributos de qualidade e funcionalidade dos sistemas de software baseados em serviços. Além disso, 71% dos participantes investem em tais atributos de forma explı́cita em seus respectivos artefatos. No entanto, até onde os autores deste artigo tem conhecimento, não existe um mapeamento que possibilite identificar o estágio da pesquisa realizada na área de qualidade de serviços em SOC de forma mais ampla, dificultando a identificação de tendências e oportundiades de pesquisa e tornando mais lenta a adoção das técnicas propostas, apesar da pesquisa em SOC ter emergido há pouco mais de 10 anos [1]. Uma exceção seria a tese de Hilari [7] que descreve uma revisão sistemática realizada no contexto de QoS em SOA, porém duas ressalvas são feitas com relação a este trabalho. A primeira é o escopo de sua revisão, verificado pela string de busca utilizada, que se restringe ao termo web service para representar toda a Arquitetura Orientada a Serviços. 
Outra diz respeito ao ano da revisão, de 2009. O presente mapeamento sistemático tem por objetivo um escopo mais amplo e atual do atual cenário de pesquisas no contexto de QoS em SOC, isto é, engloba todo o paradigma de orientação a serviços na computação com enfoque em QoS ao longo dos últimos 10 anos. Dado esse objetivo, a avaliação foi feita principalmente no contexto de estágio das pesquisas e na identificação dos seus principais grupos, assim como na identificação de quais atributos de QoS em SOC são mais investigados e em quais áreas de SOC houve maior quantidade de publicações. Nossa hipótese inicial é que o foco de pesquisas relativas a QoS em SOC esteja próximo do patamar de amadurecimento em que uma quantidade significativa de trabalhos objetiva evidenciar os benefı́cios das soluções propostas por meio de estudos experimentais e validações. A investigação de tal hipótese é feita apresentando os resultados de um mapeamento sistemático (MS) [8], cujo protocolo é descrito na Seção II, revelando indı́cios acerca da maturidade da pesquisa, grupos que atuam na área, focos de contribuição e quais sãos os principais atributos de qualidade investigados. Organizamos nosso mapeamento seguindo o protocolo definido na Seção II onde são definidas as questões de pesquisa. Para a realização colaborativa do mapeamento foi implementada uma ferramenta que é sucintamente explicada na Seção III. As questões de pesquisa são respondidas ao longo da Seção IV e uma discussão acerca dos dados obtidos é feita na Seção V. Finalmente, apresentamos as ameaças para a validade dos nossos resultados na Seção VI, bem como nossas considerações finais na Seção VII. II. M ÉTODO DO E STUDO Um mapeamento sistemático tem por objetivo classificar informações acerca de uma área de pesquisa de forma ampla que a tradicional revisão sistemática de estudos. Uma vez constatada a vasta quantidade de publicações no campo de SOC, escolheu-se esta metodologia para viabilizar a classificação dos artigos. A metodologia adotada seguiu as diretrizes propostas em [8], cujos passos necessários são descritos no restante desta seção. Um MS, assim como outras revisões literárias, estabelece o uso de um protocolo que documenta as etapas do mapeamento de modo a garantir sua replicação e diminuir um possı́vel viés por parte dos pesquisadores. Nele estão definidas as questões de pesquisa, os fóruns cientı́ficos onde as publicações são recuperadas, a string de busca utilizada, os critérios de inclusão e exclusão de artigos, além das facetas de classificação. A. Questões de Pesquisa As questões de pesquisa foram organizadas de acordo com a motivação desse estudo, que é investigar e categorizar as contribuições de pesquisa em Computação Orientada a Serviços no contexto de qualidade de serviço. Esse estudo tem como objetivo responder às seguintes perguntas: (1) QP1 Quais áreas de SOC são mais frequentemente pesquisadas no contexto de qualidade de serviços? (2) QP2 Quais atributos de qualidade são frequentemente considerados nos estudos abordados? (3) QP3 Quais são os grupos de pesquisa, no Brasil e no mundo, que mais publicam no contexto desse estudo? (4) QP4 Qual o foco da contribuição de pesquisa realizada? A QP1 tem como objetivo trazer uma perspectiva do cenário das pesquisas em Computação Orientada a Serviços com foco em QoS atualmente. Para responder a essa pergunta, primeiramente definimos quais são as áreas que melhor caracterizam as diversas contribuições de pesquisa em SOC. 
Nota-se a importância da contribuição dada na definição dessa faceta devido à escassez de referências que identifiquem as principais atividades envolvidas em SOC e que são alvos de pesquisa, tais como a seleção, composição, monitoramento e adaptação de serviços, entre outras. Com relação à QP2, pretendemos obter com esse estudo quais são os atributos de QoS mais frequentemente explorados em SOC. Em outras palavras, considerando que QoS, nesse contexto, envolve atributos como disponibilidade, confiabilidade, desempenho, segurança, escalabilidade, custo e SLA, quais desses atributos estão de fato em foco. Com relação à QP3, pretendemos também identificar quais grupos de pesquisa, no Brasil e no mundo, mais publicaram no contexto desse estudo. Por fim, a QP4 almeja elucidar quais tipos de pesquisa são mais frequentes e inferir conclusões acerca da maturidade da pesquisa realizada na área. Vale ressaltar que, no escopo deste artigo, não pretendemos avaliar o mérito dos trabalhos estudados.

B. Estratégia de Busca

Nossa estratégia de busca consistiu essencialmente na busca eletrônica nas seguintes bibliotecas digitais: ACM Digital Library, ScienceDirect, IEEE Xplore e SpringerLink, que estão entre as bibliotecas mais relevantes para o contexto da nossa pesquisa. Para formular os termos de busca para as bases de dados eletrônicas, usamos a abordagem sugerida por Kitchenham [9], [10]. A estratégia deriva os termos de busca a partir das questões de pesquisa, usando uma composição com os operadores OR e AND. Inicialmente foram escolhidos os termos mais importantes, i.e., Quality of Service e Service Oriented Computing, para o foco do trabalho e, a partir desses, foram derivados os sinônimos utilizados. A Tabela I apresenta a string de busca usada no nosso estudo. Para evitar a tendenciosidade sobre quais comunidades de pesquisa são as mais atuantes no nosso domínio de interesse, assim como para obter o tamanho real do volume das contribuições, resolvemos não adotar técnicas como snowballing, nas quais outros trabalhos relacionados podem ser encontrados a partir das referências dos trabalhos extraídos automaticamente [10].

TABLE I – TERMOS DE BUSCA UTILIZADOS PARA PESQUISA DE PUBLICAÇÕES
((“web service” OR “web services” OR “service oriented” OR “service-oriented” OR SOA OR SaaS OR PaaS OR “service orientation” OR “service-oriented computing” OR “service oriented computing” OR SOC) AND (“quality of services” OR “quality of service” OR QOS))

A string de busca é validada pela verificação direta de que os termos escolhidos representam as questões de pesquisa, de que os dados extraídos serão capazes de responder às questões de pesquisa, uma vez que contemplam as facetas de classificação, e de que o procedimento de análise dos dados é apropriado.
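Apenas a título de ilustração da composição descrita acima (sinônimos unidos por OR dentro de cada conceito e conceitos combinados com AND), segue um esboço mínimo em Python; os nomes de variáveis são hipotéticos e o trecho não faz parte da ferramenta dos autores.

# Esboço ilustrativo: monta a string de busca da Tabela I a partir de dois
# grupos de sinônimos, com OR dentro de cada grupo e AND entre os grupos.
soc_termos = ['"web service"', '"web services"', '"service oriented"',
              '"service-oriented"', 'SOA', 'SaaS', 'PaaS', '"service orientation"',
              '"service-oriented computing"', '"service oriented computing"', 'SOC']
qos_termos = ['"quality of services"', '"quality of service"', 'QOS']

def grupo_or(termos):
    # Une os sinônimos de um mesmo conceito com o operador OR.
    return "(" + " OR ".join(termos) + ")"

string_de_busca = "(" + " AND ".join([grupo_or(soc_termos), grupo_or(qos_termos)]) + ")"
print(string_de_busca)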
Por fim, os termos SaaS e PaaS foram adicionados para representar trabalhos mais recentes com foco na computação em nuvem, respectivamente definidos como Software como Serviço e Plataforma como Serviço. Entende-se que ambas as abordagens fazem uso de conceitos e/ou tecnologias relacionados a SOC, portanto foram incluídas no contexto desse estudo.

C. Critérios de Inclusão e Exclusão

Para filtrar os artigos coletados, utilizamos os seguintes critérios de inclusão e exclusão. Incluímos apenas artigos publicados em workshops, congressos e periódicos nas bibliotecas digitais que satisfaziam nossa string de busca, conforme descrito na Seção II-B. Artigos considerados como gray literature, i.e., relatórios técnicos e white papers, foram excluídos devido à grande quantidade de artigos científicos já considerados no escopo do mapeamento. No que tange às contribuições em SOC, foram consideradas somente aquelas que lidavam com níveis de abstração acima do sistema operacional, e.g., relativas a middleware ou plataformas de distribuição. Contribuições que lidavam com SOC, mas que não tratavam de nenhum aspecto de QoS, foram excluídas. Também foram excluídos artigos que poderiam ser considerados como resumos estendidos, em geral aqueles com número de páginas igual ou inferior a quatro.

Por fim, o histograma da Figura 1 representa o número de publicações encontradas nas bibliotecas digitais consultadas, mostrando as primeiras ocorrências em 2002. Percebemos que houve uma queda no ano de 2011, porém a curva retoma sua tendência já em 2012. Com base nisso, resolvemos fazer uma avaliação que compreendesse um período representativo para o mapeamento. Importante destacar que o ano de 2013 não foi considerado, pois, até a data em que a classificação e a análise dos demais anos foram executadas, não era possível obter informações conclusivas sobre esse ano.

Fig. 1. Quantidade de publicações relacionadas a QoS em SOC entre 2002 e 2012 coletadas

D. Esquema de Classificação

Os artigos foram classificados de acordo com as seguintes categorias:

- Contribuição: Essa faceta classifica partes relevantes de SOC, de modo a agrupar trabalhos de pesquisa por subáreas de contribuição. Definir essa faceta foi um primeiro desafio para o grupo, pois não encontramos na literatura algum trabalho que defina de forma clara e concisa as dimensões que devem definir e constituir SOC. Considerando a experiência dos autores e depois de um vasto estudo da literatura, foram definidos os seguintes atributos dessa faceta: composição, coordenação & comunicação, descoberta & seleção, ciclo de vida, monitoramento & adaptação e modelos de QoS & linguagens. Algumas categorias que tinham forte correlação entre si foram agrupadas, como monitoramento & adaptação e coordenação & comunicação. Em particular, a categoria modelos de QoS & linguagens engloba publicações que definem extensões ou novos modelos de QoS por meio de modelos computacionais, linguagens, especificações e/ou ontologias a serem utilizadas em sistemas baseados em serviços para o suporte à análise, à garantia ou ao gerenciamento de QoS.

- Contexto: Essa faceta representa os atributos de qualidade de maior relevância para SOC, além das opções para os demais atributos não mencionados, atributos genéricos e referentes a SLA. São eles: disponibilidade, desempenho, confiabilidade, escalabilidade, segurança, custo, outros e SLA. O item SLA engloba trabalhos que não especificam quais atributos de qualidade em específico estão tratando, sendo consideradas contribuições genéricas no contexto de QoS.

- Pesquisa: A última faceta é usada para caracterizar o tipo de pesquisa realizada. Para definir essa faceta, utilizamos as definições em [11]: solução (artigos que propõem uma nova solução e que, ocasionalmente, utilizam um pequeno exemplo para verificar a sua viabilidade), validação (artigos que apresentam estudos empíricos ou provas que corroboram a aplicabilidade de alguma técnica), avaliação (artigos que apresentam algum tipo de avaliação comparativa entre técnicas propostas e/ou existentes e discorrem sobre os benefícios e limitações num contexto em que já houve casos de uso reais) e experiência pessoal (quando os autores apresentam a experiência prática do uso de alguma técnica ou discutem tendências de pesquisa sobre um tema específico).
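Um esboço meramente ilustrativo, em Python, de como o esquema de classificação acima poderia ser representado e aplicado a um artigo; os nomes e o registro de exemplo são hipotéticos (a ferramenta efetivamente usada pelos autores é descrita na Seção III).

# Esboço ilustrativo: as três facetas do esquema de classificação e a montagem
# de um registro de classificação (os atributos de uma faceta não são
# mutuamente exclusivos, como discutido na Seção IV).
FACETAS = {
    "contribuicao": {"composicao", "coordenacao & comunicacao", "descoberta & selecao",
                     "ciclo de vida", "monitoramento & adaptacao", "modelos de QoS & linguagens"},
    "contexto": {"disponibilidade", "desempenho", "confiabilidade", "escalabilidade",
                 "seguranca", "custo", "outros", "SLA"},
    "pesquisa": {"solucao", "validacao", "avaliacao", "experiencia pessoal"},
}

def classificar(artigo_id, **atribuicoes):
    # Valida cada valor atribuído contra a faceta correspondente e devolve o registro.
    for faceta, valores in atribuicoes.items():
        invalidos = set(valores) - FACETAS[faceta]
        if invalidos:
            raise ValueError(f"valores fora do esquema na faceta '{faceta}': {invalidos}")
    return {"artigo": artigo_id, **atribuicoes}

# Exemplo hipotético: a faceta de contribuição reflete o que a Seção IV relata
# para o artigo [13]; os valores de contexto e pesquisa são apenas ilustrativos.
registro = classificar("ref-13",
                       contribuicao=["composicao", "modelos de QoS & linguagens",
                                     "monitoramento & adaptacao"],
                       contexto=["confiabilidade"],
                       pesquisa=["solucao"])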
E. Extração dos Dados e Mapeamento dos Estudos

Inicialmente foram feitas consultas manuais em cada uma das bibliotecas digitais mencionadas na Seção II-B. Verificou-se, ao todo, um número de 1239 publicações a serem analisadas. Para atender a essa quantidade significativa de publicações, a coleta dos resultados de busca foi automatizada por um minerador capaz de recuperar as publicações nas bibliotecas digitais e armazená-las em um banco de dados. Mais especificamente, o minerador da Figura 2 armazena os metadados das publicações resultantes das buscas nas diferentes bibliotecas.

III. FERRAMENTA DE MAPEAMENTO COLABORATIVO

Com base na experiência de alguns dos autores deste artigo, que haviam observado a dificuldade em se trabalhar com revisões sistemáticas de forma colaborativa, decidiu-se pelo desenvolvimento de uma ferramenta de apoio para permitir a análise e a classificação dos artigos com maior eficiência e ubiquidade de trabalho, capaz também de gerar resultados gráficos em tempo real. Esta ferramenta consiste em um ambiente disponível em nuvem, com interfaces para a listagem das publicações coletadas automaticamente pelo minerador ou registradas de forma manual pela interface de registro de novas publicações. Sua arquitetura está representada na Figura 2, com a seguinte descrição dos componentes:

• Minerador: Responsável pela coleta dos metadados das publicações a partir de uma determinada string de busca. Para tanto, efetua requisições HTTP/REST a servidores web de bibliotecas digitais. Há duas possibilidades de retorno tratáveis para as chamadas, dependendo da biblioteca usada: HTML puro, representando a própria página de resultado de buscas, ou o padrão XML, sendo ambos interpretados e tratados para a extração dos metadados.

• Aplicação Web: Consiste na aplicação que utiliza os dados coletados pelo minerador, com funcionalidades de criação, edição, listagem e uso de usuários, grupos, facetas de classificação, publicações, revisões e geração síncrona de resultados. Sua implementação segue o modelo de desenvolvimento MVC, com definições de entidades e relacionamentos genéricos para que o ambiente possa receber mapeamentos diversos que utilizem facetas de classificação criadas em tempo real. As Figuras 3, 4 e 5 ilustram, respectivamente, as interfaces de listagem e de classificação de publicações e um exemplo de resultado gerado durante o mapeamento.

• Base de Dados: Uma única base de dados atua como repositório da coleta de dados do minerador e das classificações e resultados obtidos por meio da aplicação web.

Também no ambiente da ferramenta de mapeamento, cada publicação pode ser classificada utilizando uma interface apropriada, que contém os metadados do artigo, campos de observações e marcações dos itens de classificação definidos para o MS em questão, conforme a Figura 4. O uso dessa ferramenta foi de grande importância para a viabilidade do MS diante da quantidade inicial de publicações coletadas. Além disso, tal ferramenta permitiu a realização do mapeamento de forma colaborativa, onde os artigos são compartilhados entre diferentes grupos de usuários em suas respectivas sessões autenticadas.

Fig. 2. Arquitetura da Ferramenta de Mapeamento Colaborativo
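Apenas para ilustrar o papel do Minerador descrito acima, segue um esboço mínimo em Python; a URL, os parâmetros e os nomes de elementos XML são hipotéticos, já que cada biblioteca digital expõe interfaces de busca próprias, e o trecho não corresponde ao código da ferramenta dos autores.

import requests  # biblioteca HTTP amplamente usada; aqui apenas como exemplo
import xml.etree.ElementTree as ET

def minerar(url_busca, string_de_busca):
    # Efetua uma requisição HTTP com a string de busca e devolve uma lista de
    # metadados (título e ano), tratando retorno em XML ou em HTML puro.
    resposta = requests.get(url_busca, params={"q": string_de_busca}, timeout=30)
    tipo = resposta.headers.get("Content-Type", "")
    publicacoes = []
    if "xml" in tipo:
        raiz = ET.fromstring(resposta.text)
        for item in raiz.iter("registro"):  # nome de elemento hipotético
            publicacoes.append({"titulo": item.findtext("titulo"),
                                "ano": item.findtext("ano")})
    else:
        # Para HTML puro seria necessário um parser, com regras específicas
        # para a página de resultados de cada biblioteca digital.
        pass
    return publicacoes

# Uso hipotético:
# metadados = minerar("https://exemplo.org/busca", string_de_busca)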
A partir da distribuição automática de artigos para cada um dos pesquisadores pela aplicação web, foram excluídas manualmente as publicações que não se adequaram aos critérios definidos na Seção II-C, resultando em uma lista final de 364 artigos. Portanto, da quantidade inicial coletada nas bases de busca, 75% foram considerados fora dos critérios adotados para o mapeamento. Tal processo, assim como a classificação, foi realizado manual e individualmente, tendo havido frequente discussão para eliminar quaisquer dúvidas e inconsistências de interpretação quanto aos critérios de inclusão, exclusão e facetas de classificação.

Fig. 3. Listagem das publicações
Fig. 4. Interface usada para mapeamento
Fig. 5. Exemplo de resultado gerado para a faceta de QoS (contexto)

IV. RESULTADOS

Nesta seção apresentamos os principais resultados do nosso estudo, agrupados pelas questões de pesquisa discutidas na seção anterior.

A. Questão de Pesquisa 1: Quais áreas de SOC são mais frequentemente pesquisadas no contexto de qualidade de serviços?

Para responder a essa pergunta, primeiramente identificamos as principais características de SOC. Encontramos no projeto europeu S-Cube a fundamentação mais clara para definir tais características [12] e, portanto, o adotamos como referência para a definição da faceta de contribuição. Visando atender ao foco desse estudo, no entanto, adaptamos a estrutura definida no projeto S-Cube, uma vez definido que o presente MS não abrangeria estudos que tratassem, em específico, da infraestrutura baseada em serviços (IaaS).

A Figura 6 apresenta o diagrama ilustrando a distribuição dos artigos no eixo horizontal de SOC, a faceta de contribuição. Assim como nos resultados observados nas outras facetas, vale ressaltar que os atributos mapeados não são mutuamente excludentes, visto que representam partes de SOC. Por exemplo, o artigo [13] lida com composição, modelos de QoS & linguagens, além de monitoramento & adaptação. Existe dentre eles alguma sobreposição, notavelmente entre composição e coordenação. No entanto, notou-se que o termo coordenação também envolve aspectos da comunicação entre provedores e consumidores de serviços em geral, independentemente de composição de serviços. O resultado desse mapeamento mostra que a maior parte dos trabalhos publicados lida com modelos de QoS & linguagens (31.32%), seguida de descoberta & seleção (30.77%), monitoramento & adaptação (30.49%), composição (28.85%), ciclo de vida (13.46%) e, finalmente, coordenação & comunicação, com 8.24%.

Fig. 6. Bubble plot representando os tipos de contribuição (eixo horizontal: ciclo de vida, composição, coordenação & comunicação, descobrimento & seleção, modelos de QoS & linguagens, monitoramento & adaptação) e o contexto de QoS (eixo vertical: SLA, segurança, outros, escalabilidade, disponibilidade, desempenho, custo e confiabilidade)
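Como os atributos de uma faceta não são mutuamente excludentes, os percentuais acima somam mais de 100%. Um esboço mínimo em Python, com dados fictícios, de como essa contagem por atributo pode ser obtida a partir dos registros de classificação:

from collections import Counter

def percentuais_por_atributo(registros, faceta, total):
    # Conta quantos artigos receberam cada atributo da faceta indicada e
    # converte em percentual sobre o total de artigos incluídos (364 no estudo).
    contagem = Counter(atributo
                       for r in registros
                       for atributo in r[faceta])
    return {atributo: 100.0 * n / total for atributo, n in contagem.items()}

# Exemplo com dois registros fictícios (um artigo pode aparecer em mais de um atributo):
registros = [
    {"contribuicao": ["composicao", "modelos de QoS & linguagens"]},
    {"contribuicao": ["descoberta & selecao"]},
]
print(percentuais_por_atributo(registros, "contribuicao", total=2))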
Esses resultados indicam o foco dado a aspectos não funcionais, dinâmicos e que podem sofrer variações devido à concorrência e a possíveis falhas dos serviços em tempo de execução, uma preocupação que pode refletir problemas no modelo de terceirização de serviços. Uma outra observação: dado que o ambiente SOC possui características próprias e diferenciadas, é natural que novas métricas de QoS tenham sido definidas, ou que métricas já utilizadas tenham ganhado novos significados, assim como linguagens e especificações que admitam o tratamento e a negociação dos requisitos de qualidade. O resultado obtido nesse estudo para o atributo de modelos de QoS & linguagens atesta essa constatação. Nota-se também que a descoberta e a seleção de serviços foram bem representadas nas pesquisas, assim como a composição. No primeiro grupo, é considerada a descoberta e a escolha entre serviços de mesma funcionalidade, porém com diferentes níveis de QoS. O segundo abrange não somente a composição, mas também a escolha da melhor configuração de modo a atender aos níveis globais desejáveis ou necessários de QoS para um conjunto de serviços. Por fim, notou-se um número reduzido de trabalhos que abordam ciclo de vida, e menos expressivo ainda com relação a coordenação & comunicação. Com relação ao ciclo de vida, isso pode evidenciar a necessidade de contribuições que deem suporte a QoS desde os estágios iniciais do ciclo de desenvolvimento dos próprios serviços. Quanto à coordenação & comunicação, isso pode evidenciar a necessidade de contribuições, e.g. taxonomias e arcabouços, que provejam maneiras de fornecer informações contextuais no gerenciamento, preservação, distribuição ou atualização de QoS em serviços complexos, dado que a tarefa de coordenação de serviços deve estabelecer tais contribuições.

B. Questão de Pesquisa 2: Quais atributos de qualidade são frequentemente considerados nos estudos abordados?

Para responder a essa pergunta, olhamos a distribuição dos artigos no eixo de QoS, a faceta de contexto. O eixo vertical da Figura 6 apresenta os resultados dessa distribuição. Vale ressaltar que, como cada artigo pode tratar de múltiplos atributos de QoS, a soma total do número de artigos mapeados em cada um dos atributos não totalizará o número de artigos incluídos no mapeamento. Por exemplo, o artigo [14] lida com os atributos de disponibilidade, desempenho e confiabilidade. O mapa mostra que SLA (56.59%) é o que predomina, seguido de desempenho (37.36%), disponibilidade (28.57%) e confiabilidade (27.20%). Os atributos menos observados são custo (14.01%), segurança (9.34%) e escalabilidade (6.04%). Os artigos que não se enquadraram especificamente em nenhum desses atributos foram classificados em outros (10.44%), como aqueles que envolvem outros atributos de qualidade; por exemplo, S. Wang et al. [15] definem um critério de seleção de serviço conforme sua reputação.

A partir desses resultados, pode-se notar que, no contexto de SOC, os termos mais relacionados a QoS são SLA, desempenho, disponibilidade e confiabilidade, com bastante ênfase em SLA. Este resultado se justifica uma vez que diversas contribuições para monitoramento e adaptação, descoberta, composição e seleção de serviços, entre outras, não especificam atributos de qualidade, deixando em aberto quais métricas serão usadas ao se instanciar a proposta. Com relação aos dados específicos, conclui-se que desempenho, disponibilidade e confiabilidade são prioridade como atributos de QoS em SOC. Segurança, escalabilidade e custo, no entanto, tiveram menor evidência nos trabalhos analisados em nosso estudo, sendo consideradas possíveis lacunas de pesquisa.
C. Questão de Pesquisa 3: Quais grupos de pesquisa, no Brasil e no mundo, mais publicam no contexto desse estudo?

Para responder a esta questão, identificamos os três principais grupos de pesquisa que mais contribuíram com publicações em SOC no contexto de QoS. Na ferramenta que implementamos para automatizar esse mapeamento é possível identificar vários outros grupos de pesquisa; mas, por uma questão de limitação de espaço, listamos nesse artigo apenas os três grupos que mais se destacaram. Esses grupos foram classificados conforme segue:

Grupo S-Cube – Identificamos que, dentre os grupos que mais contribuíram no âmbito desse estudo, estão pesquisadores cuja afiliação está inserida direta ou indiretamente no contexto do grupo europeu S-Cube [12]. No período do nosso estudo, o grupo S-Cube contribuiu com 10 publicações relevantes para esse contexto. Destacam-se os autores Schahram Dustdar (Vienna University of Technology) e Raffaela Mirandola (Politecnico di Milano), com seus colaboradores Valeria Cardellini e Emiliano Casalicchio, ambos da Università di Roma “Tor Vergata”. Em particular, percebemos que Mirandola e seus colaboradores têm contribuído em pesquisas relacionadas a monitoramento e adaptação no contexto de confiabilidade, disponibilidade e desempenho, como pode ser percebido nas publicações [16], [17], [18], [19], [20]. Dustdar e seus colaboradores, por sua vez, têm contribuído em composição de serviços em ambientes dinâmicos no escopo de SLA, com destaque para o VRESCO (Vienna Runtime Environment for Service-Oriented Computing) [21].

Daniel Menascé et al. – Menascé e seus colaboradores têm tradicionalmente contribuído com pesquisas relativas a QoS, em particular no âmbito de desempenho, incluindo os diversos fóruns da área [22], [23]. No período do nosso estudo, Menascé et al. contribuíram com 6 publicações e destacaram-se no contexto de SLA nas áreas de descobrimento & seleção e monitoramento & adaptação [24], [25], [26]. Menascé et al. contribuem com o SASSY [24], um arcabouço que gera automaticamente arquiteturas de software candidatas e seleciona aquela que melhor se adequa ao objetivo de QoS. Num outro trabalho [25], eles apresentam um algoritmo de otimização para a busca de provedores de serviço com restrições de custo e tempo de execução.

Kwei-Jay Lin et al. – O grupo de Lin (University of California, Irvine) tem contribuído também com 6 publicações, em vários aspectos de SOC. Em particular, no contexto de SLA no âmbito de composição (dinâmica) e adaptação de serviços, contribuíram com 4 publicações no período do mapeamento. Particular destaque para as contribuições que lidam com reconfiguração de serviços em Service-Oriented Architecture [27], [28], [29], com restrições de QoS fim-a-fim.

No Brasil em particular, o grupo que mais tem contribuído com o tema no período do estudo foi o grupo de Rubira et al., com foco em confiabilidade e disponibilidade (particularmente tolerância a falhas), quanto a QoS, e em aspectos dinâmicos de SOC (monitoramento & adaptação, coordenação & comunicação, composição) [30], [31], [32].

D. Questão de Pesquisa 4: Qual o foco da contribuição de pesquisa realizada em SOC e relacionada com qualidade de serviço?

Com o intuito de responder a esta questão, foi feita uma avaliação da distribuição dos artigos em relação ao tipo de pesquisa (conforme discutido na Seção II).
O bubble plot na Figura 7 apresenta tal distribuição, novamente sendo importante ressaltar que o número total de artigos nos gráficos é superior ao número total de artigos analisados, uma vez que alguns artigos apresentam contribuições tanto em termos de uma nova solução proposta quanto em termos de avaliação e/ou validação. Por exemplo, Huang et al. propõem um modelo estocástico para representar e raciocinar sobre dependabilidade em um ambiente de SOC, ao mesmo tempo em que validam formalmente tal proposta por meio de provas de teoremas [33].

Fig. 7. Bubble plot com a distribuição envolvendo tipos de pesquisa (solução, validação, avaliação e experiência pessoal) e contexto (QoS)

Esta investigação revelou que 155 artigos, isto é, 42.5% do total, apresentam, além de uma nova solução relacionada à qualidade de serviços em SOC, uma validação da proposta por meio de experimentos e simulações empíricas (e.g. [34], [35], [33], [36]). Por outro lado, 163 artigos, ou 44.8%, são propostas de soluções que não apresentam validações empíricas, isto é, carecem de experimentos que demonstrem a viabilidade e os benefícios da(s) técnica(s) proposta(s); entre esses artigos podemos citar [37], [38], [39], [40]. Finalmente, 33 artigos (9%) foram classificados como avaliação, sendo que 10 apresentam, além de uma nova solução relacionada à qualidade de serviços em SOC, uma avaliação de técnicas correlatas em casos de uso reais, avaliando as limitações e os benefícios das soluções existentes, e 23 artigos (6.3%) avaliam em casos de uso reais propostas existentes, identificando seus benefícios e limitações. Esses números revelam que a área de pesquisa de qualidade de serviço em SOC ainda está em uma fase de amadurecimento no que se refere ao uso e à adoção das propostas em casos de uso reais, dado o baixo percentual dos artigos classificados como avaliação (9%), o que indica a necessidade de mais pesquisas capazes de comparar diferentes técnicas já existentes. Também se observou que um percentual significativo das contribuições simplesmente apresenta novas abordagens ou faz comparações envolvendo a própria técnica proposta (44.8%).

V. DISCUSSÃO SOBRE OS RESULTADOS

Com base nos resultados evidenciados por esse MS, é possível confirmar vários aspectos relevantes. O resultado desse mapeamento mostra que a maior parte dos trabalhos publicados lida com modelos de QoS & linguagens (31.32%), seguida de descoberta & seleção (30.77%), monitoramento & adaptação (30.49%) e composição (28.85%). Esses resultados indicam o foco dado a aspectos não funcionais, dinâmicos e que podem sofrer variações devido à concorrência e a possíveis falhas dos serviços em tempo de execução. Mas também indicam o foco em propostas que visam à representação e à escolha da melhor configuração de modo a atender aos níveis globais desejáveis ou necessários de QoS para um conjunto de serviços. No contexto de SOC, os dados gerais nos induzem a concluir que desempenho, disponibilidade e confiabilidade são prioridade como atributos de QoS. Segurança, escalabilidade e custo, no entanto, tiveram menor evidência nos trabalhos analisados em nosso estudo. Um futuro estudo com base em cada um desses atributos separadamente pode servir para se ter um panorama mais específico em cada uma dessas áreas.
Mas entendemos que, devido à abrangência de cada um desses tópicos, tornar-se-ia inviável tratar dessa questão em nosso estudo. Quanto aos grupos que mais contribuíram com publicações para o contexto desse mapeamento, os três líderes são os seguintes: (1) o S-Cube (com foco em monitoramento & adaptação, assim como composição dinâmica), (2) Menascé et al. (com foco em descobrimento & seleção, assim como monitoramento & adaptação) e (3) Lin et al. (composição dinâmica e adaptação). Percebemos também que suas contribuições se encaixam nas áreas de SOC que mais se destacaram em nosso estudo. Finalmente, em relação ao tipo de pesquisa em qualidade de serviço em SOC, consideramos que a mesma ainda está em uma fase de amadurecimento, na qual um percentual significativo das contribuições simplesmente apresenta novas abordagens ou faz comparações envolvendo a própria técnica proposta.

Por fim, conforme observamos na Seção IV-D, pesquisas em qualidade de serviço em SOC ainda estão em uma fase de amadurecimento no que se refere ao uso e à adoção das propostas em casos de uso reais. Dado que 44.8% dos trabalhos mapeados não realizam validação e que apenas 9% conduzem uma avaliação em caso de uso real, não podemos confirmar nossa hipótese inicial de que pesquisas relativas a QoS em SOC estejam próximas a um patamar de amadurecimento, conforme o ciclo de maturação de Redwine et al. [4]. A discrepância entre o número de pesquisas com validação e aquelas do tipo avaliação pode estar diretamente relacionada à dificuldade de adoção e implementação das propostas em cenários reais, em contraste com as simulações realizadas em ambientes controlados de experimentação. Essa pode ser uma realidade importante identificada na pesquisa com foco em SOC no contexto de QoS. Também é importante ressaltar que o ciclo de maturação identificado por Redwine et al. [4] pode não representar a realidade das pesquisas dos últimos 10 anos. Essa hipótese poderia ser verificada num trabalho futuro, de modo a se obter uma melhor comparação entre a maturidade esperada e aquela identificada pelo presente MS. Na próxima seção apresentamos algumas ameaças à validade do nosso estudo, bem como as estratégias que seguimos para que as mesmas fossem contornadas.

VI. AMEAÇAS À VALIDADE

As possíveis ameaças aos resultados dessa pesquisa, bem como à generalização dos mesmos, são descritas nessa seção, conforme a classificação apresentada em [41]. A Validade de construção está relacionada às decisões operacionais que foram tomadas, durante o planejamento do estudo, com o intuito de responder às questões de pesquisa. As principais construções do nosso estudo buscam definir os componentes de SOC (que deram origem à faceta de contribuição), definir o período de publicações significativo para o estudo e realizar o mapeamento sistemático em si. Quanto ao primeiro item, além de não existir um consenso sobre os elementos constituintes da Computação Orientada a Serviços, existe uma certa sobreposição entre os mesmos. Nesse sentido, as definições do projeto S-Cube [12] foram essenciais para possibilitar uma classificação mais precisa dos trabalhos. Quanto ao período das publicações, o nosso objetivo era identificar tendências de pesquisa na área de QoS em SOC, e o fato de não termos considerado trabalhos publicados após dezembro de 2012 pode representar uma ameaça aos nossos resultados.
Por outro lado, consideramos que trabalhos publicados entre janeiro de 2002 e dezembro de 2012 nos permitem avaliar um período bastante significativo do estudo, conforme ilustrado na Figura 1. Além disso, consideramos que uma avaliação parcial do ano corrente (2013) poderia levar a avaliações inconclusivas quanto à tendência do estudo. Por fim, quanto à terceira ameaça de construção, para realizar nosso estudo seguimos as diretrizes de referência de Kitchenham [9] para definir as questões de pesquisa, o critério de busca e o protocolo do estudo.

A Validade interna estabelece a relação causal, garantindo que certas condições levem de fato a outras condições. Em particular, a ameaça nesse sentido consiste na seleção incompleta ou inadequada das contribuições ou na tendenciosidade da visão individual de cada pesquisador envolvido. Um estudo abrangente como este, envolvendo quatro pesquisadores, leva a algum risco nesse sentido, uma vez que a classificação dos artigos pode envolver certo grau de subjetividade na interpretação de cada faceta. Com o intuito de minimizar tal ameaça, algumas reuniões objetivaram esclarecer as facetas do estudo, enquanto outras serviram para realizar avaliações em pares. Foram feitas diversas discussões entre os autores deste trabalho sempre que a análise de um artigo resultava em dúvidas quanto à sua classificação. Finalmente, a seleção correta pode ser percebida tanto pela definição adequada do protocolo na Seção II quanto por resultados conclusivos como aqueles apresentados pelos principais grupos na Seção IV-C.

A Validade externa está relacionada com a possibilidade de generalização do nosso estudo. Esse é um aspecto muito importante, que está relacionado ao escopo do nosso estudo, cujo objetivo é entender a pesquisa realizada em QoS em SOC. Por outro lado, o termo QoS é bastante amplo, sendo inviável realizar um estudo compreensivo sobre todos os atributos de qualidade relacionados a SOC (segurança, desempenho, tolerância a falhas etc.). Nesse sentido, e conforme discutido na Seção II, a nossa string de busca se restringiu aos termos quality of services, quality of service e QOS. Tais termos deveriam aparecer em algum campo de metadados de um artigo (como título, resumo ou palavras-chave) para que o mesmo fosse recuperado pelo minerador desenvolvido. Ou seja, não generalizamos nossos resultados para pesquisas desenvolvidas em uma área específica de QoS, mas sim para a área de QoS de forma mais genérica. Como trabalho futuro, temos o interesse de reproduzir mapeamentos sistemáticos para algumas áreas específicas de QoS (tratamento de exceções, segurança etc.).

VII. CONCLUSÃO E TRABALHO FUTURO

Esse artigo apresenta um mapeamento sistemático na área de Qualidade de Serviço em Computação Orientada a Serviços, visando identificar (a) o tipo e a maturidade da pesquisa realizada; (b) os atributos de qualidade e componentes de SOC considerados; e (c) os principais grupos interessados no tema. Os dados obtidos traçaram um panorama da pesquisa relacionada, servindo de insumo e orientação para o aprofundamento de novas pesquisas nos diferentes tópicos envolvidos e identificando aspectos importantes da maturidade da pesquisa dentro do período avaliado. Consideramos que a definição dos tópicos mais relevantes de QoS em SOC como uma faceta de classificação constituiu também uma das contribuições do artigo ao final do estudo.
Além disso, determinou-se quais aspectos de qualidade tiveram maior destaque entre as pesquisas avaliadas, evidenciando não só o interesse dos pesquisadores, mas também possı́veis lacunas de pesquisa em termos de QoS em ambientes orientados a serviços. Por fim, foi possı́vel a identificação de grupos de pesquisa mais ativos na área em questão, o que possibilita o aprofundamento dos resultados por meio de técnicas de snowballing e também fornece um ponto de partida para revisões sistemáticas de literatura. Como trabalho futuro, consideramos aprimorar a ferramenta de mapeamento criada para viabilizar esse estudo de modo a permitir melhor usabilidade com as bibliotecas digitais e viabilizar o uso de técnicas como snowball. Quanto ao mapeamento sistemático em si, pretendemos utilizar a base de dados aceita para a realização de uma revisão sistemática da literatura (Systematic Literature Review) com vistas a uma avaliação mais aprofundada dos trabalhos avaliados, como por exemplo, seus benefı́cios e limitações, assim como suas contribuições relativas à arquitetura de software. A confiabilidade está relacionada ao grau de objetividade de um estudo, em que uma possı́vel replicação levaria a resultados parecidos. Com a infraestrutura que disponibilizamos para a realização de mapeamentos sistemáticos, podemos tanto estender essa investigação em um trabalho futuro quanto convidarmos outros pesquisadores para replicar tal avaliação1 . Em particular, a estratégia de sı́ntese e interpretação escolhidas podem trazer diferentes insights, mas acreditamos que as tendências observadas devem permanecer as mesmas. 1 Acessı́vel em conta guest pelo endereço http://mapping-study.heroku.com/ 63 R EFERENCES [1] M. P. Papazoglou and D. Georgakopoulos, “Introduction: Serviceoriented computing,” Commun. ACM, vol. 46, no. 10, pp. 24–28, Oct. 2003. [Online]. Available: http://doi.acm.org/10.1145/944217.944233 [2] Y. Wei and M. B. Blake, “Service-oriented computing and cloud computing: Challenges and opportunities,” IEEE Internet Computing, vol. 14, pp. 72–75, 2010. [3] T. Erl, SOA Principles of Service Design. Upper Saddle River, NJ, USA: Prentice Hall PTR, 2007. [4] S. T. Redwine, Jr. and W. E. Riddle, “Software technology maturation,” in Proceedings of the 8th international conference on Software engineering, ser. ICSE ’85. Los Alamitos, CA, USA: IEEE Computer Society Press, 1985, pp. 189–200. [Online]. Available: http://dl.acm.org/citation.cfm?id=319568.319624 [5] A. Dan and P. Narasimhan, “Dependable service-oriented computing,” Internet Computing, IEEE, vol. 13, no. 2, pp. 11–15, 2009. [6] D. Ameller, M. Galster, P. Avgeriou, and X. Franch, “The Role of Quality Attributes in Service-based Systems Architecting: A Survey,” in European Conference on Software Architecture (ECSA), 2013. [7] M. Hilari, X. Gutiérrez, and J. Gómez, “Quality of service (qos) in soa systems: A systematic review,” Master’s thesis, Universitat Politècnica de Catalunya, Barcelona, Spain, 2009. [8] K. Petersen, R. Feldt, S. Mujtaba, and M. Mattsson, “Systematic mapping studies in software engineering,” in Proceedings of the 12th international conference on Evaluation and Assessment in Software Engineering, ser. EASE’08. Swinton, UK, UK: British Computer Society, 2008, pp. 68–77. [Online]. Available: http: //dl.acm.org/citation.cfm?id=2227115.2227123 [9] B. Kitchenham and S. Charters, “Guidelines for performing systematic literature reviews in software engineering,” Keele University and Durham University Joint Report, Tech. Rep. 
EBSE 2007001, 2007. [Online]. Available: http://www.dur.ac.uk/ebse/resources/ Systematic-reviews-5-8.pdf [10] D. Budgen, M. Turner, P. Brereton, and B. Kitchenham, “Using mapping studies in software engineering,” in Proceedings of the Psychology of Programming Interest Group, PPIG (2008), 2008, pp. 195–204. [11] R. Wieringa, N. Maiden, N. Mead, and C. Rolland, “Requirements engineering paper classification and evaluation criteria: a proposal and a discussion,” Requirements Engineering, vol. 11, pp. 102–107, 2006, 10.1007/s00766-005-0021-6. [12] K. Pohl, “S-cube network of excellence final report,” University of Duisburg-Essen, Tech. Rep., 2012, http://www.s-cube-network.eu/ final-report.pdf. [13] Z. Zheng and M. R. Lyu, “A qos-aware fault tolerant middleware for dependable service composition,” in Proceedings of the 2009 IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2009, 2009, pp. 239–248. [14] R. Calinescu, L. Grunske, M. Z. Kwiatkowska, R. Mirandola, and G. Tamburrelli, “Dynamic qos management and optimization in servicebased systems,” IEEE Trans. Software Eng., vol. 37, no. 3, pp. 387–409, 2011. [15] S. Wang, Q. Sun, H. Zou, and F. Yang, “Reputation measure approach of web service for service selection,” Software, IET, vol. 5, no. 5, pp. 466 –473, oct. 2011. [16] V. Cardellini, E. Casalicchio, V. Grassi, F. Lo Presti, and R. Mirandola, “Qos-driven runtime adaptation of service oriented architectures,” in Proceedings of the the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, ser. ESEC/FSE ’09. New York, NY, USA: ACM, 2009, pp. 131–140. [Online]. Available: http://doi.acm.org/10.1145/1595696.1595718 [17] R. Calinescu, L. Grunske, M. Kwiatkowska, R. Mirandola, and G. Tamburrelli, “Dynamic qos management and optimization in servicebased systems,” IEEE Trans. Softw. Eng., vol. 37, no. 3, pp. 387–409, May 2011. [Online]. Available: http://dx.doi.org/10.1109/TSE.2010.92 [18] D. Ardagna and R. Mirandola, “Per-flow optimal service selection for web services based processes,” J. Syst. Softw., vol. 83, no. 8, pp. 1512–1523, Aug. 2010. [Online]. Available: http: //dx.doi.org/10.1016/j.jss.2010.03.045 [19] V. Cardellini, E. Casalicchio, V. Grassi, S. Iannucci, F. L. Presti, and R. Mirandola, “Moses: A framework for qos driven runtime adaptation of service-oriented systems,” IEEE Transactions on Software Engineering, vol. 38, pp. 1138–1159, 2012. [20] V. Cardellini, E. Casalicchio, V. Grassi, F. Lo Presti, and R. Mirandola, “Architecting dependable systems vi,” in Architecting Dependable Systems VI, R. Lemos, J.-C. Fabre, C. Gacek, F. Gadducci, and M. Beek, Eds. Berlin, Heidelberg: Springer-Verlag, 2009, ch. Towards Self-adaptation for Dependable Service-Oriented Systems, pp. 24–48. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-10248-6 2 [21] A. Michlmayr, F. Rosenberg, P. Leitner, and S. Dustdar, “End-to-end support for qos-aware service selection, binding, and mediation in vresco,” Services Computing, IEEE Transactions on, vol. 3, no. 3, pp. 193 –205, july-sept. 2010. [22] D. A. Menascé and H. Gomaa, “A method for design and performance modeling of client/server systems,” IEEE Trans. Software Eng., vol. 26, no. 11, pp. 1066–1085, 2000. [23] D. A. Menasce and V. Almeida, Capacity Planning for Web Services: metrics, models, and methods, 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR, 2001. [24] D. Menasce, H. Gomaa, S. Malek, and J. 
Sousa, “Sassy: A framework for self-architecting service-oriented systems,” Software, IEEE, vol. 28, no. 6, pp. 78–85, nov.-dec. 2011.
[25] D. A. Menascé, E. Casalicchio, and V. K. Dubey, “On optimal service selection in service oriented architectures,” Perform. Eval., vol. 67, no. 8, pp. 659–675, 2010.
[26] V. Dubey and D. Menascé, “Utility-based optimal service selection for business processes in service oriented architectures,” in Web Services (ICWS), 2010 IEEE International Conference on, july 2010, pp. 542–550.
[27] K.-J. Lin, J. Zhang, and Y. Zhai, “An efficient approach for service process reconfiguration in soa with end-to-end qos constraints,” in Proceedings of the 2009 IEEE Conference on Commerce and Enterprise Computing, ser. CEC ’09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 146–153. [Online]. Available: http://dx.doi.org/10.1109/CEC.2009.87
[28] K.-J. Lin, J. Zhang, Y. Zhai, and B. Xu, “The design and implementation of service process reconfiguration with end-to-end qos constraints in soa,” Serv. Oriented Comput. Appl., vol. 4, no. 3, pp. 157–168, Sep. 2010. [Online]. Available: http://dx.doi.org/10.1007/s11761-010-0063-6
[29] Y. Zhai, J. Zhang, and K.-J. Lin, “Soa middleware support for service process reconfiguration with end-to-end qos constraints,” in Proceedings of the 2009 IEEE International Conference on Web Services, ser. ICWS ’09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 815–822. [Online]. Available: http://dx.doi.org/10.1109/ICWS.2009.126
[30] D. Leite, C. Rubira, and F. Castor, “Exception handling for service component architectures,” in Dependable Computing (LADC), 2011 5th Latin-American Symposium on, 2011, pp. 84–93.
[31] A. S. Nascimento, C. M. F. Rubira, and J. Lee, “An spl approach for adaptive fault tolerance in soa,” in Proceedings of the 15th International Software Product Line Conference, Volume 2, ser. SPLC ’11. New York, NY, USA: ACM, 2011, pp. 15:1–15:8. [Online]. Available: http://doi.acm.org/10.1145/2019136.2019153
[32] E. Gonçalves and C. Rubira, “Archmeds: An infrastructure for dependable service-oriented architectures,” in Engineering of Computer Based Systems (ECBS), 2010 17th IEEE International Conference and Workshops on, 2010, pp. 371–378.
[33] J. Huang, C. Lin, X. Kong, and Y. Zhu, “Modeling and analysis of dependability attributes of service computing systems,” in Services Computing (SCC), 2011 IEEE International Conference on, july 2011, pp. 184–191.
[34] B. Jeong, H. Cho, and C. Lee, “On the functional quality of service (fqos) to discover and compose interoperable web services,” Expert Syst. Appl., vol. 36, no. 3, pp. 5411–5418, Apr. 2009. [Online]. Available: http://dx.doi.org/10.1016/j.eswa.2008.06.087
[35] D. Ardagna and R. Mirandola, “Per-flow optimal service selection for web services based processes,” Journal of Systems and Software, vol. 83, no. 8, pp. 1512–1523, 2010, Performance Evaluation and Optimization of Ubiquitous Computing and Networked Systems. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0164121210000750
[36] C. Schröfer, M. Binshtok, S. Shimony, A. Dayan, R. Brafman, P. Offermann, and O. Holschke, “Introducing preferences over nfps into service selection in soa,” in Service-Oriented Computing - ICSOC 2007 Workshops, ser. Lecture Notes in Computer Science, E. Nitto and M. Ripeanu, Eds., vol. 4907. Springer Berlin Heidelberg, 2009, pp. 68–79. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-93851-4_8
[37] A. Filieri, C. Ghezzi, and G. Tamburrelli, “A formal approach to adaptive software: continuous assurance of non-functional requirements,” Form. Asp. Comput., vol. 24, no. 2, pp. 163–186, mar 2012. [Online]. Available: http://dx.doi.org/10.1007/s00165-011-0207-2
[38] A. S. Nascimento, C. M. F. Rubira, and J. Lee, “An spl approach for adaptive fault tolerance in soa,” in Proceedings of the 15th International Software Product Line Conference, Volume 2, ser. SPLC ’11. New York, NY, USA: ACM, 2011, pp. 15:1–15:8. [Online]. Available: http://doi.acm.org/10.1145/2019136.2019153
[39] Z. Balfagih and M. Hassan, “Quality model for web services from multistakeholders’ perspective,” in Information Management and Engineering, 2009. ICIME ’09. International Conference on, april 2009, pp. 287–291.
[40] A. Clark, S. Gilmore, and M. Tribastone, “Service-level agreements for service-oriented computing,” in Recent Trends in Algebraic Development Techniques, ser. Lecture Notes in Computer Science, A. Corradini and U. Montanari, Eds. Springer Berlin Heidelberg, 2009, vol. 5486, pp. 21–36.
[41] P. Leedy, Practical research: planning and design. Macmillan, 1980.

aCCountS: Uma arquitetura orientada a serviços para flexibilizar a tarifação em nuvens de infraestrutura

Nayane Ponte∗, Fernando Trinta∗, Ricardo Viana∗, Rossana Andrade∗, Vinícius Garcia† & Rodrigo Assad‡
∗ Grupo de Redes de Computadores, Engenharia de Software e Sistemas, Universidade Federal do Ceará, Fortaleza - CE. Email: {nayaneviana, fernandotrinta, ricardoviana, rossana}@great.ufc.br
† Centro de Informática – Universidade Federal de Pernambuco, Recife - PE. Email: [email protected]
‡ USTO.RE – Recife, PE. Email: [email protected]

Abstract—Cloud Computing is a recent paradigm where different IT resources such as applications or hardware are quickly provisioned to customers through a pay-per-use model. Substantial research has already been conducted concerning pricing services in cloud computing, but existing proposals lack flexibility to establish how services are accounted. These services also seem very dependent on specific cloud infrastructures. This paper proposes an architecture for charging cloud services decoupled from specific providers. This service is complemented by a domain-specific language that allows the creation of flexible pricing policies. Such policies aim at supporting cloud billing requirements collected from a survey, allowing the construction of rules that meet different customer profiles. Based on this architecture, a prototype has been implemented and tested to validate our proposal in two different cloud infrastructures. These experiments aim at testing (i) the correctness of the charging flow between two components (server and agent) and (ii) the invoice calculation.

I. INTRODUÇÃO

A computação em nuvem (do inglês, Cloud Computing) é um tema em grande evidência tanto na indústria quanto na academia. Segundo Vaquero et al. [1], o termo computação em nuvem refere-se a um conjunto de recursos virtuais de fácil uso e acesso, que podem ser dinamicamente reconfigurados para se adequarem a uma carga de trabalho variável, permitindo uma otimização no uso de tais recursos. Em geral, estes recursos são provisionados aos seus usuários como serviços. A literatura sobre computação em nuvem apresenta três modelos de entrega de serviços, que podem ser vistos como sobreposições em diferentes níveis da pilha de uma infraestrutura de TI.
No nı́vel mais alto estão as aplicações, ou Software como Serviço (SaaS - do inglês, Software as a Service), nas quais os clientes podem utilizar sistemas de software com fins especı́ficos, e que são acessados, em geral, a partir de um navegador Web. No nı́vel intermediário está a Plataforma como Serviço (PaaS - do inglês, Platform as a Service), que oferece tanto um ambiente de desenvolvimento quanto execução de aplicações, incluindo frameworks, ferramentas de desenvolvimento e testes de aplicações para nuvens especı́ficas. Nestes dois nı́veis, o usuário da nuvem não controla ou administra a infraestrutura subjacente, como rede, servidores, sistema operacional. No nı́vel mais baixo, a Infraestrutura como Serviço (IaaS, do inglês Infrastructure as a Service) oferece máquinas 65 virtualizadas com sistemas operacionais próprios, onde clientes podem instalar e configurar aplicações de acordo com seus interesses. Neste último nı́vel, o cliente da nuvem tem controle sobre as configurações das máquinas virtuais, porém não sobre a infraestrutura da nuvem. Independente do modelo utilizado, recursos em computação em nuvem são tipicamente explorados por um modelo de pagamento por uso (do inglês, pay-as-you-go), com garantias oferecidas pelo provedor do serviço através de acordos de garantias de serviço (do inglês, Service Level Agreements). Em consequência, a tarifação de serviços é apontada como uma caracterı́stica chave para computação em nuvem[2], oferecendo desafios interessantes de pesquisa a serem estudados. Segundo Lucrédio e Silva [3], ferramentas e mecanismos que facilitem o monitoramento e tarifação de recursos, auxiliando as funções administrativas de gerenciamento da infraestrutura da nuvem são imprescindı́veis para o sucesso da computação em nuvem. Este trabalho aborda o tema de oferta de serviços em nuvem ao apresentar uma arquitetura, baseada em um modelo de referência [4], para um serviço de tarifação chamado aCCountS (a Cloud aCcounting Service), voltado para nuvens de infraestrutura. Nesse serviço, o administrador pode definir polı́ticas de tarifação de seus recursos de modo rápido e simplificado, por meio de uma DSL - Linguagem de domı́nio especı́fico (do inglês, Domain Specific Language), intitulada aCCountS-DSL. Esta proposta tem como objetivo principal definir uma solução que melhore a flexibilidade de como os recursos de um provedor de IaaS podem ser precificados e tarifados. Para alcançar tal objetivo, a linguagem aCCountSDSL foi especificada com objetivo de atender diversas questões em aberto no âmbito da contabilização em nuvem, como a definição de polı́ticas de cobrança flexı́veis, de fácil de manutenção e que atendam a diferentes objetivos de cobrança (lucro, bem estar social ou preço justo). Por meio dos artigos encontrados em nossa revisão literária, vários trabalhos citam diferentes critérios (consumo médio de CPU e memória, storage, dentre outros) como elementos interessantes para compor o preço final do aluguel de máquinas virtuais em nuvens. Estes elementos podem ainda tornar a cobrança mais flexı́vel e compreensı́vel (as regras de tarifação) aos clientes de serviços de nuvem. Por meio da aCCountS- DSL é possı́vel criar regras para tarifar diferentes recursos ou requisitos propostos. Ela evita a reprogramação do componente de contabilização da nuvem, alterando apenas suas regras, que são atualizadas dinamicamente pelo sistema. Para validação da proposta, um protótipo do aCCountS foi implementado e testado em duas infraestruturas de nuvens distintas. 
Estes experimentos objetivaram testar a corretude do (i) fluxo de tarifação entre os dois componentes (agente e servidor) e do (ii) cálculo da fatura. É válido ressaltar que o projeto de uma solução de tarifação envolve muitos aspectos além da flexibilidade, como por exemplo, segurança e escalabilidade. Porém, no contexto deste trabalho, estes aspectos não são abordados. Este artigo está organizado em seis seções. A partir desta introdução, a segunda seção aponta trabalhos relacionados à proposta aCCountS. A terceira seção apresenta uma visão geral sobre o funcionamento do aCCountS. Em seguida, a seção 4 apresenta sua arquitetura destacando seus componentes e a linguagem aCCountS-DSL. A seção cinco descreve a avaliação experimental e a análise da nossa proposta. Por fim, a sexta seção apresenta conclusões e trabalhos futuros. II. T RABALHOS R ELACIONADOS A tarifação em nuvem ainda é um campo recente de pesquisa. Porém, durante a fase inicial deste estudo foram realizadas pesquisas nas quais foram encontrados artigos e trabalhos que já exploraram o tema. Nesta seção são apresentados alguns destes trabalhos que contribuem para a proposta aCCountS. O trabalho de Silva et al [5] teve um papel importante neste levantamento inicial ao apresentar um estudo de mapeamento sobre a tarifação em computação em nuvem. Neste mapeamento, um trabalho em destaque é a proposta de Ruiz-Agundez et al [4] para um processo de tarifação para recursos computacionais. Este trabalho é apontado como a única referência para uma taxonomia completa de um processo de contabilização e tarifação de recursos, conforme descrito na Figura 1. O fluxo de tarifação proposto por Ruiz-Agundez et al [4] é composto de oito tarefas, nas quais o resultado de uma serve como entrada para uma tarefa seguinte. A primeira tarefa é a medição e sua função é monitorar o uso de recursos na nuvem. Estes recursos podem ser o consumo de CPU ou memória de uma determinada máquina virtual. A segunda tarefa é a mediação, que refina os dados recebidos da tarefa anterior, transformando-os em dados mais fáceis de serem manipulados pelas próximas etapas. Após a mediação, a tarefa de contabilidade tem por funções a filtragem, a coleta e a agregação das informações sobre o uso de recursos por um determinado cliente. Como resultado de sua execução, registros de sessão são enviados para a cobrança, que realiza as tarefas relacionadas a geração dos registros de cobrança para um cliente especı́fico. A tarefa de precificação é responsável por definir as operações que serão realizadas no cálculo da tarifação. A tarefa cobrança realiza o cálculo da cobrança a partir dos dados medidos pela nuvem (contabilidade) e dos valores dos recursos monitorados (precificação). Em seguida, tarefa de faturamento recebe os registros de cobrança realizados durante um perı́odo de tempo, para então calcular a fatura completa. Caso o faturamento calculado seja de um cliente de 66 Fig. 1. Fluxo de tarifação, adaptado de Ruiz-Agundez et al [4] outra máquina virtual ou nuvem, esses dados são repassados para a tarefa roaming, cuja função é enviá-lo para a máquina virtual ou nuvem onde o cliente possua cadastro. Por fim, a fatura também é enviada à tarefa de compensação financeira, que realiza o cumprimento do pagamento da fatura. Outra preocupação do estudo proposto pelos autores é a forma que a polı́tica de tarifação é implementada. 
Segundo Ruiz-Agundez et al [4], cada provedora de nuvem desenvolve seu sistema de contabilização próprio e o codifica diretamente na infraestrutura da nuvem, dificultando manutenções na polı́tica de negócios. Na proposta de Ruiz-Agundez et al [4], diferentes requisitos de nuvem foram levados em conta, com objetivo de tornar a tarifação flexı́vel, permitindo a alteração das polı́ticas de forma fácil. Porém, para validar seu modelo, o fluxo descrito na Figura 1 foi implementado em um serviço de tarifação comercial, chamado de JBilling1 , o que impede que maiores detalhes a respeito da solução possam ser estudados. Trabalhos interessantes também podem ser obtidos no campo da computação em grade, e que podem ser aproveitados para a computação em nuvem. Caracas e Altmann [6] propõem tarifar os serviços de uma grade levando em consideração fatores como a quantidade de recursos, o tempo no qual tais recursos foram utilizados, a qualidade do serviço e o perfil do usuário. Além disso, é proposta uma definição dinâmica de preços, na qual a tarifação utiliza componentes da infraestrutura da grade para disponibilizar dados para o faturamento. Já Narayan [7] propõe que os esquemas de tarifação deveriam levar em consideração não apenas os perfis de hardware utilizados e seu tempo de uso, mas o próprio uso dos recursos, 1 http://www.jbilling.com/ como memória, CPU e disco. Além de estudos acadêmicos, foi também verificado como a indústria utiliza a tarifação em serviços de computação em nuvem. A utilização desses serviços de tarifação disponı́veis no mercado, tais como JBilling e Cloud Billing2 , são baseadas em BRMS - Business Rule Management System (Sistemas de regras de negócios). Apesar desse tipo de sistema proporcionar uma facilidade na definição das regras de negócio por usuários que não tem conhecimentos em programação, segundo Hartmann Software Group [8], (i) os BRMS não são otimizados para tarefas computacionais que requerem alto processamento, (ii) não são ambientes ideais para escrever algoritmos complexos (diminuindo a eficiência do mecanismo de regras e tornando difı́cil para os usuários definirem formas mais complexas) e (iii) perturbações em seus modelos de objeto (ou regras) podem ter implicações em todo o sistema (dado que as regras são definidas por outras regras ou modelos de objetos, criando um caminho de dados para realizar a inferência). Outra questão levantada nos artigos estudados é quanto ao objetivo da polı́tica de tarifação. Segundo Caracas [6], uma polı́tica de tarifação de uma provedora de nuvem busca alcançar tipos diferentes de finalidades, tais como maximizar os lucros, maximizar o bem estar social ou definir um esquema de preço justo. A forma como os serviços de nuvem são cobrados no mercado não parece justa quando dois clientes que requisitam instâncias iguais na nuvem pagam o mesmo preço, mesmo o primeiro utilizando 90% dos recursos da máquina virtual instanciada, enquanto o segundo usa apenas 10% de uma instância semelhante. Também não parece pensar no bem estar social, quando não se preocupa com o grande gasto de energia nos centros de dados, não promovendo incentivos a clientes que aceitariam medidas com propósito de diminuir a demanda por energia [6]. Tomando por base os trabalhos encontrados relacionados à tarifação, o aCCountS utiliza partes de cada um deles. No caso do Ruiz-Agundez et al [4], seu processo de tarifação é utilizado, porém sem a tarefa de roaming. 
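Apenas para explicitar esse encadeamento (o fluxo de Ruiz-Agundez et al. [4] sem a tarefa de roaming), segue um esboço mínimo em Python; as funções são hipotéticas (stubs) e servem somente para mostrar que a saída de cada tarefa alimenta a tarefa seguinte, não correspondendo à implementação do aCCountS.

# Esboço ilustrativo do fluxo de tarifação adotado (sem roaming): cada função
# abaixo é apenas um marcador (stub) da tarefa correspondente descrita no texto.
def fluxo_de_tarifacao(uso_bruto, politica_de_precos):
    dados = mediacao(medicao(uso_bruto))            # medição -> mediação
    sessoes = contabilidade(dados)                  # filtragem e agregação por cliente
    registros = cobranca(sessoes, precificacao(politica_de_precos))
    fatura = faturamento(registros)                 # consolida um período de cobrança
    return compensacao_financeira(fatura)           # efetiva o pagamento da fatura

# Stubs triviais, apenas para o esboço executar de ponta a ponta:
medicao = mediacao = contabilidade = faturamento = compensacao_financeira = lambda x: x
precificacao = lambda politica: politica
cobranca = lambda sessoes, precos: {"sessoes": sessoes, "precos": precos}

print(fluxo_de_tarifacao({"cpu": 0.5}, {"cpu": 0.10}))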
O aCCountS define uma linguagem de domı́nio especı́fico para definição de polı́ticas com a possibilidade de definir um conjunto de recursos monitores. Nestas polı́ticas, os critérios propostos por Caracas e Altmann [6] podem ser explorados. Por fim, as polı́ticas definidas na linguagem aCCountS-DSL também podem utilizar a proposta de Narayan [7], contabilizando a quantidade de recursos consumidos, como por exemplo, dando desconto a usuário que consome apenas 50% de seus recursos na nuvem. III. aCCountS - Cloud aCCOUNTing Service Com base nos estudos levantados, este trabalho propõe uma nova forma de contabilização para serviços em nuvem. Essa proposta constitui-se de dois elementos. O primeiro é um serviço de tarifação em nuvem, que possui seu fluxo de execução na proposta de Ruiz-Agundez et al, explicada anteriormente. O serviço é formado por dois componentes, o agente de monitoramento e o componente de tarifação. O agente monitora os recursos das máquinas virtuais e o componente interpreta regras de tarifação definidas pela provedora de nuvem (administrador), contabiliza a utilização destes 2 http://www.ibm.com/developerworks/cloud/library/cl-devcloudmodule/ 67 recursos e gera uma conta de tarifação para o cliente. O segundo elemento é uma linguagem de domı́nio especı́fico, chamada aCCountS-DSL, que permite a administradores de nuvem definirem regras de tarifação para recursos de sua infraestrutura no serviço. A Figura 2 mostra uma visão geral sobre o aCCountS. Fig. 2. Visão Geral da Proposta Em linhas gerais, pode-se separar a utilização do serviço em duas etapas. Na etapa inicial é feita sua configuração. Nela, o administrador da nuvem precisa definir os parâmetros de tarifação, ou seja, quais serão os recursos monitorados nas máquinas virtuais de sua nuvem, e que serão utilizados para composição das regras de tarifação. Em geral, estes recursos incluem uso de CPU, memória, disco, tempo de uso, transações em bancos de dados, serviços e software utilizados, dentre outros. Após esta definição, faz-se necessário estabelecer valores (parâmetros de precificação) para os recursos que serão monitorados. Na proposta do aCCountS, a precificação é fortemente atrelada ao conceito de perfil de uma máquina virtual. Este perfil refere-se às possı́veis configurações das máquinas virtuais que podem ser instanciadas na nuvem, de forma semelhante ao que acontece em infraestruturas de nuvem conhecidas, como Amazon EC2 ou GoGrid. Por exemplo, na Amazon AWS, o usuário pode escolher entre instâncias do tipo Micro, Pequena, Média, Grande e ExtraGrande[9]. Cada uma destas instâncias tem valores diferentes para seu uso. Com estes parâmetros definidos, o administrador pode criar as polı́ticas de tarifação de sua nuvem utilizando a aCCountSDSL. Neste caso, as polı́ticas definem as diferentes práticas em relação a cobrança pelo uso de recursos de suas máquinas virtuais instanciadas. Estas polı́ticas podem incluir uma tarifação apenas pelo tempo de uso de uma máquina virtual, ou podem agregar taxas em relação ao uso dos recursos de rede, média de consumo de memória, quantidade de transações no banco de dados, além de descontos por economia de energia, SLA ou perfil do cliente. Almeja-se utilizar os diferentes requisitos de contabilização para criar uma polı́tica flexı́vel. A segunda etapa diz respeito à execução do serviço. Para isto, em cada máquina virtual, um agente é responsável por coletar informações dos parâmetros de tarifação e repassá-las a um repositório. 
De posse das informações sobre consumo de recursos das máquinas virtuais e de como os recursos devem ser tarifados, o tarifador pode então gerar uma conta de consumo para clientes da nuvem. Para melhor especificar cada uma destas etapas, a próxima seção aborda em mais detalhes (i) a arquitetura do serviço Fig. 3. Diagrama de componentes UML com os principais elementos de aCCountS proposto e (ii) a linguagem de domı́nio especı́fico, aCCountSDSL IV. O iParameters faz uma chamada ao aCCountS-Service, solicitando quais parâmetros deverão ser monitorados. Esses parâmetros já foram configurados com os recursos que serão medidos e a forma de medı́-los. Para exemplificar esta configuração, o Código 1 mostra variáveis que foram cadastradas para medir uso de CPU, memória e disco, com a respectiva forma de capturar seus valores para o sistema operacional Linux na Amazon EC2. A RQUITETURA DO aCCountS A arquitetura do aCCountS é representada na Figura 3, por meio de um diagrama de componentes UML. Nesta figura, os elementos marcados com o estereótipo Entity representam abstrações que denotam dados manipulados na proposta aCCountS. Os demais componentes da arquitetura assumem o papel de gerenciadores destes recursos e exercem papel de processamento das etapas do fluxo de tarifação. A arquitetura aCCountS possui dois macro-componentes: (i) o aCCountSAgent e o (ii) aCCountS-Service. O primeiro representa o agente de coleta de informações, e que deve estar presente em cada máquina virtual em execução na nuvem. O segundo representa um serviço que recebe os dados monitorados, e que através da configuração de seus sub-componentes, consegue gerar a tarifação das máquinas virtuais associadas a um determinado cliente. A divisão do aCCountS em macro-componentes foi motivada pela ideia de um serviço de tarifação em nuvem desacoplado da infraestrutura da mesma. Com isto, objetiva-se a concepção de um componente reutilizável, no caso o serviço, que pode ser utilizado para realizar o processo de tarifação para diferentes nuvens de infraestrutura. Código 1. 1 2 3 Definição dos recursos a serem medidos - Amazon EC2 { "cpu": uptime | awk -F’: ’ ’{print $2}’ | awk -F’, ’ ’{ print $1}’, "memoria": free -m|grep Mem|awk ’{printf("%f", $3/$2)}’, "armazenamento": df -k ’/dev/disk/by-label/cloudimg-rootfs ’ | grep udev | awk ’{printf("% f", $3/$4)}’ } O iResources periodicamente envia ao aCCountS-Service as informações de uso da máquina virtual para o cálculo da fatura. O Código 2 mostra um exemplo de um arquivo JSON enviado pelo aCCountS-Agent via iResources. Nas próximas duas sub-seções serão descritos em mais detalhes como os subcomponentes aCCountS-Agent e aCCountS-Service operam para o estabelecimento do fluxo de tarifação. Código 2. 1 A comunicação entre o aCCountS-Agent e o aCCountSService é realizado por meio de troca de mensagens no formato JSON3 . Como ilustrado na Figura 3-(b), o aCCountSService fornece uma API para comunicação com o agente. A interface iParameters serve para que o aCCountS-Agent obtenha do serviço, quais são os recursos (parâmetros) a serem monitorados e como estes devem ser obtidos da máquina virtual onde o agente está sendo executado. Já a interface iResources permite que o agente envie os registros de medição, a partir dos recursos monitorados. Dessa maneira, qualquer empresa de nuvem pode criar seu próprio agente para tarifar seus serviços, desde que mantendo a compatibilidade com a API fornecida. 
3 http://www.json.org/ 68 Exemplo de um registro de medição { "cpu": 0.06068, "memoria": 0.54445, "armazenamento": 0.05128 } A. O Agente - aCCountS-Agent O aCCountS-Agent é formado pelos sub-componentes AgentReceptorManager, o MeteringVM, MediatorVM e pelo AgentSendManager, como modelado pelo diagrama de componentes mostrado na Figura 3 - (a). Cada sub-componente do aCCountS-Agent realiza um passo para atender a funcionalidade de monitoramento de recursos de uma máquina virtual. Estes passos são (i) identificar os recursos que devem ser monitorados, (ii) realizar a medição e (iii) a mediação destes recursos, e por fim, (iv) enviar as informações de consumo para o aCCountS-Service. Na arquitetura do aCCountS-Agent, o AgentReceptorManager é responsável por buscar no aCCountS-Service os parâmetros (API iParameters), interpretá-los e enviá-los ao sub-componente MeteringVM. Um AgentReceptorManager de uma máquina virtual está associado a uma máquina no aCCountS-Service por meio de um identificador único (id). Na requisição dos parâmetros, o agente utiliza este identificador, permitindo ao aCCountS-Service reconhecer a polı́tica de tarifação da máquina virtual e quais recursos que devem ser monitorados, para então, fornecê-los de forma correta ao agente, de acordo com seu sistema operacional. Cada parâmetro recebido pelo AgentReceptorManager é um arquivo JSON com uma lista dos recursos e seus respectivos comandos para serem monitorados a partir do Sistema Operacional implantado na máquina virtual. Dado que os (i) parâmetros e seus comandos de execução, (ii) a polı́tica definida para tarifação da máquina virtual e (iii) seu perfil de precificação foram definidos na primeira parte do processo, a etapa de configuração do aCCountS está completada. Por meio dos comandos passados pelo AgentReceptorManager, o MeteringVM mede a utilização de cada recurso na máquina virtual em uma certa frequência predeterminada e envia esses registros para o componente MediatorVM que os intermedia. O MediatorVM recebe os dados medidos, calcula a média de utilização dos recursos e a cada hora, gera um registro de medição. Esses registros são conduzidos para o AgentSendManager, que os expede, uma vez ao dia, para o aCCountS-Service (API iResources). Os tempos de medição e de envio dos registros ao serviço podem ser configurados pela administrador da infraestrutura de nuvem, mas a configuração padrão no MeteringVM indica que os dados são monitorados a cada minuto, enquanto o MediatorVM realiza suas funções em intervalos de uma hora. Esta decisão pretende não sobrecarregar a rede com informações de medição a cada minuto no tráfego de dados entre agente e serviço, nem o próprio aCCountS-Service. O aCCountS-Service com posse das informações de consumo de uma máquina virtual inicia seu processo de tarifação por meio dos seus sub-componentes, como especificado na próxima subseção. B. O Serviço - aCCountS-Service Como foi declarado, o aCCountS-Service foi idealizado para ser um serviço de tarifação para qualquer provedora de nuvem. Para isto, é necessário instalar um agente nas máquinas virtuais em execução, de tal modo que estes agentes busquem os parâmetros (recursos e comandos para serem monitorados) no aCCountS-Service (interface iParameters), para em seguida monitorar o uso da máquina virtual e enviar ao serviço os registros de mediação (média de uso dos recursos) ao serviço (interface iResources). 
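Para concretizar o fluxo agente-serviço descrito acima, segue um esboço mínimo em Ruby do ciclo buscar parâmetros (iParameters), medir e enviar registros (iResources); a URL do serviço, o identificador da máquina e o formato exato das rotas são suposições ilustrativas, não especificadas no artigo.

    require "json"
    require "net/http"
    require "uri"

    # URL e identificador hipotéticos, apenas para ilustração.
    SERVICO    = "http://accounts.example.com"
    ID_MAQUINA = "vm-01"

    # iParameters: obtém os recursos a monitorar e os comandos de medição.
    def buscar_parametros
      uri = URI("#{SERVICO}/parameters/#{ID_MAQUINA}")
      JSON.parse(Net::HTTP.get(uri))   # ex.: {"cpu" => "uptime | awk ...", ...}
    end

    # MeteringVM: executa cada comando no sistema operacional da máquina virtual.
    def medir(parametros)
      parametros.each_with_object({}) do |(recurso, comando), registro|
        registro[recurso] = `#{comando}`.strip.to_f
      end
    end

    # iResources: envia um registro de medição em JSON ao aCCountS-Service.
    def enviar(registro)
      uri = URI("#{SERVICO}/resources/#{ID_MAQUINA}")
      Net::HTTP.post(uri, registro.to_json, "Content-Type" => "application/json")
    end

    enviar(medir(buscar_parametros))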
A tarifação de cada máquina virtual é processada por meio dos sub-componentes do aCCountS-Service, que recebem os registros de medição e geram uma fatura mensal para o cliente. Esses sub-componentes são o ResourcesManager, o AccountingManager, o PriceManager, o ProfileManager, o BillingManager e o ClientManager. O PriceManager é ainda subdivido em outros três componentes, a saber: PolicyManager, DSL-Compiler e ChargeManager. A arquitetura do aCCountS-Service é ilustrada na Figura 3-(c) por meio de um diagrama de componentes UML. 69 No aCCountS-Service, o ResourcesManager tem a função de enviar os parâmetros para o agente, quando solicitado. Duas entidades são manipuladas por este componente: o Resources e o CommandsResources. A primeira representa os recursos a serem monitorados, enquanto a segunda, os comandos que os agentes devem executar para medir tais recursos. Esta configuração deve ser feita pelo administrador da provedora da nuvem, que cadastra todos os recursos que poderão ser monitorados por ela e os seus respectivos comandos. Em nosso protótipo, esta configuração é feita por meio de uma interface Web. O ResourcesManager ao ser consultado pelo agente, recebe o identificador da máquina virtual. Por meio deste identificador, sua polı́tica de tarifação é recuperada. Com isto, o ResourcesManager verifica na polı́tica quais recursos precisam ser medidos e os envia por meio de parâmetros (recursos e comandos) ao agente. Já o AccountingManager tem a função de receber os registros de medição do agente e ativar o componente PriceManager para contabilizar os registros recebidos. Este componente é responsável por gerar a função de precificação para contabilizar os registros de medição, além de calcular o custo do serviço e gerar as cobranças. Estas tarefas são distribuı́das entre seus sub-componentes: PolicyManager, DSL-Compiler e ChargeManager. O PolicyManager gerencia a criação das polı́ticas de tarifação. Essas polı́ticas são definidas através da DSL proposta nesse trabalho que será descrita com mais detalhes na sub-seção IV-C. O DSL-Compiler compila a polı́tica de tarifação definida e gera uma função de precificação, enquanto o ChargeManager utiliza a função definida para calcular os registros de cobrança por meio dos recursos monitorados no agente e dos preços dos recursos (requisitados do componente ProfileManager). O componente ProfileManager é responsável por gerenciar o perfil das máquinas virtuais do agente. Ele manipula duas entidades: o Resources que representa cada recurso, e o Price que representa os preços desses recursos. Estes componentes são cadastrados pelo administrador da provedora de nuvem durante o processo de configuração do serviço. No ProfileManager, além dos recursos e seus preços, deve-se cadastrar os perfis disponı́veis na infraestrutura da nuvem, definindo a quantidade de memória, armazenamento em disco e tipo de processador das máquinas virtuais que podem ser instanciadas. Os registros de cobrança calculados pelo ChargeManager são enviados para o componente BillingManager, cuja principal função é somá-los para geração da fatura que será enviada para o ClientManager ao final de cada mês, caso o modelo de tarifação seja pós-pago. Caso o modelo da polı́tica de tarifação seja pré-pago, uma mensagem é enviada para a provedora da nuvem no momento em que o total de créditos do cliente for esgotado. O valor do crédito é medido pelo agente e enviado para o AccountingManager via iResources. 
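Um esboço hipotético, em Ruby, do comportamento descrito para o BillingManager nos dois modelos de pagamento: somar os registros de cobrança no pós-pago e notificar a provedora quando o crédito do cliente se esgota no pré-pago. Os nomes de métodos são ilustrativos e não correspondem necessariamente à implementação do protótipo.

    # Esboço hipotético do BillingManager (nomes de métodos ilustrativos).
    class BillingManager
      def initialize(modelo, credito = 0.0)
        @modelo    = modelo      # :pos_pago ou :pre_pago
        @credito   = credito
        @cobrancas = []
      end

      # Recebe os registros de cobrança calculados pelo ChargeManager.
      def registrar(valor)
        @cobrancas << valor
        notificar_provedora if @modelo == :pre_pago && total >= @credito
      end

      # Fatura: soma dos registros de cobrança acumulados no período.
      def total
        @cobrancas.sum
      end

      def notificar_provedora
        puts "Crédito esgotado para o cliente; avisar a provedora de nuvem."
      end
    end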
Com essa informação a provedora da infraestrutura de nuvem pode tomar providências pela continuidade ou não da oferta de serviços ao cliente cujo crédito se esgotou. C. A DSL de tarifação - aCCountS-DSL No contexto deste trabalho, uma polı́tica de tarifação é definida como um conjunto de regras que estabelece a forma como recursos das máquinas virtuais de um cliente são tarifados. A aCCountS-DSL é uma linguagem de domı́nio especı́fico textual para criação de polı́ticas de tarifação em nuvens de IaaS dentro da proposta aCCountS. O projeto da aCCountS-DSL foi elaborado a partir de requisitos obtidos dos trabalhos relacionados previamente descritos, bem como a partir de exemplos da indústria, com destaque para a Amazon EC2, uma vez que esta se trata da principal fornecedora de serviços de infraestrutura em nuvem do mundo, segundo Marten Mickos4 . Para ilustrar esta influência, na etapa de configuração (primeira parte do processo de tarifação), o perfil da máquina é definido no aCCountS-Service pelo administrador, que precifica os recursos que serão contabilizados. Um conjunto de requisitos de tarifação determina o valor de um perfil, muito semelhante ao que ocorre na Amazon EC2, em que os diferentes requisitos definem as diferentes instâncias e seus preços. Por exemplo, uma instância localizada na Virgı́nia, cuja polı́tica é sob-demanda e o perfil da máquina é small-instance custa $ 0.06 por hora de uso. Se sua polı́tica for reservada por um ano, custará $ 0.034 por hora de uso. Se o perfil da máquina modificar, passando para medium-instance, o custo do serviço passará a $ 0.12 por hora de uso. A aCCountS-DSL busca permitir que qualquer nuvem possa definir suas regras e criar sua polı́tica de tarifação através de uma linguagem que atenda aos requisitos utilizados pela Amazon e aos requisitos apontados nos trabalhos relacionados apresentados na seção II. Os requisitos utilizados pelo aCCountS são apresentados a seguir. 1.1 - Garantia de alocação: Algumas polı́ticas de tarifação promovem maior garantia à nuvem do que outras, impactando nos custos dos seus recursos. A Amazon[9] utiliza três modelos e os mesmos são suportados pela linguagem e serviço propostos. São eles: (i) sob-demanda, que permite ao cliente pagar pela capacidade computacional utilizada por hora, sem nenhum compromisso de longo prazo, (ii) Reservado, que oferece a opção de um pagamento único e acessı́vel para cada instância que o cliente deseja reservar e, em troca, um desconto significativo sobre a custo de utilização dos serviços para essa instância, e (iii) Lance, que é concedido por meio de leilão dos recursos subutilizados, sem uma garantia sobre o tempo de disponibilidade do serviço. Este requisito é definido no momento da configuração do perfil, na qual os preços dos recursos variam conforme seu requisito. 1.2 - Modelo de pagamento: Os modelos propostos pela Amazon atualmente são pós-pagos. Entretanto, outros serviços de tecnologia aumentaram sua utilização através de planos pré-pagos, como a telefonia móvel e a Internet móvel [10]. Dessa maneira, a linguagem proposta reconhece dois modelos possı́veis de pagamento. O primeiro é o modelo pré-pago, no qual um cliente paga antes de utilizar os recursos, podendo limitar seu uso pelos créditos comprados. Já o modelo póspago reflete o modo tradicional em que um cliente contrata um plano de utilização e paga após o uso, em geral numa frequência mensal. Estes modelos foram influenciados pela proposta de Elmroth [11] e pelos serviços de tecnologia citados anteriormente. 
O modelo de pagamento é suportado pela arquitetura pro4 www.livestream.com/gigaomtv/video?clipId=pla 7a8babfe-7b50-472ebbb8-1619d153ada4&utm source=lslibrary&utm medium=ui-thumb 70 posta, entretanto ainda não está completamente funcional no protótipo implementado. A aplicação desse requisito acontece em quatro etapas, sendo que as três primeiras ocorrem no momento de configuração do serviço, enquanto a quarta ocorre durante a execução do mesmo. A primeira etapa acontece na configuração dos parâmetros, em que o administrador define os recursos de modelo de pagamento e crédito, e os associa a um determinado cliente. A segunda etapa ocorre quando da criação dos perfis das máquinas, e leva em consideração o modelo da polı́tica. No caso, definem-se preços maiores para perfis com modelo pré-pago. A terceira etapa permite ao administrador configurar os tempos de medição e de envio de dados para o serviço. Já na última etapa, o agente faz medições dos recursos de crédito e de tipo de modelo de pagamento. 1.3 - Forma de contabilização: Segundo alguns autores [12], [6], [11], [7], um serviço em nuvem consome diferentes recursos e em quantidades diferentes, sendo portanto justo pagar apenas pelos recursos consumidos, evitando que aqueles que utilizem pouco paguem o mesmo que outros que sobrecarregam o sistema. Desta forma, este requisito estabelece que um cliente possa ser tarifado por uso dos recursos (Ex: uso de 10% de CPU de uma máquina virtual em uma hora custa $ 0.003) ou por tempo de uso (1h de uso de uma máquina custa $ 0.3) ou ainda por uma opção hı́brida definida pela provedora de nuvem. Esse requisito é suportado na definição da polı́tica de tarifação. 1.4 - Perfil de hardware: máquinas com diferentes capacidades de processamento e disponibilidade de armazenamento e memória apresentam preços diferentes. Por exemplo, na Amazon EC2, uma instância Linux com perfil small (com disponibilidade de recursos pequena) custa $ 0.06 por hora, enquanto que o perfil medium aumenta seu valor para $ 0.12 por hora. Se o perfil escolhido for large, o custo aumenta para $ 0.24 por hora. Este requisito é suportado pela arquitetura por meio da configuração dos perfis das máquinas. O administrador deve levar em consideração o tipo de hardware para precificar os recursos do perfil. 1.5 - Utilitários: Na configuração de uma máquina virtual para um cliente, recursos adicionais podem ser associados à tarifação. Por exemplo, clientes podem ser tarifados pelo uso de software proprietário (como o sistema operacional Microsoft Windows), por serviços disponibilizados pela provedora de nuvem (como elasticidade automática) ou por uso de centro de dados localizados em regiões estratégicas com maior disponibilidade ou recursos adicionais. Por exemplo, no centro de dados da Amazon EC2 localizado na Virgı́nia (EUA), uma instância small utilizando o Linux custa $ 0.06 por hora, enquanto no centro de dados da Amazon situado em São Paulo, a mesma instância custa $ 0.08 por hora. Em geral, software, serviços e datacenters requerem custos adicionais à provedora de nuvem, sendo portanto, requisitos a serem considerados na definição da polı́tica de tarifação. 
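Um exemplo numérico hipotético, em Ruby, resume a diferença entre as formas de contabilização do requisito 1.3 e o efeito dos utilitários do requisito 1.5 sobre o custo de uma hora de uso, usando os próprios valores citados acima.

    # Contabilização por uso (requisito 1.3): 10% de CPU em uma hora custa $ 0.003.
    preco_cpu_hora = 0.03            # $ por 100% de CPU durante uma hora
    uso_cpu        = 0.10
    custo_por_uso  = uso_cpu * preco_cpu_hora   # => 0.003

    # Contabilização por tempo (requisito 1.3): $ 0.3 por hora, independente do uso.
    custo_por_tempo = 1 * 0.3                   # => 0.3

    # Utilitários (requisito 1.5): mesma instância small em centros de dados distintos.
    custo_virginia  = 1 * 0.06                  # => 0.06 por hora
    custo_sao_paulo = 1 * 0.08                  # => 0.08 por hora

    puts [custo_por_uso, custo_por_tempo, custo_virginia, custo_sao_paulo].inspect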
1.6 - Bem estar social: Este conceito relaciona-se com a ideia de se promover descontos na fatura do cliente em determinadas situações, como o uso de máquinas virtuais configuradas visando a economia de energia [13], grande quantidade de recursos utilizados por um cliente específico [12], recursos da nuvem subutilizados [6] e ainda questões que atendam aos SLAs (multa às provedoras, quando houver alguma violação nesses acordos) [12]. Esses requisitos são definidos nas regras de cobrança.

A partir da definição dos requisitos e de como é fornecido o suporte aos mesmos, a flexibilidade da tarifação é alcançada pela combinação das várias possibilidades de tarifação em políticas que possam atender diferentes tipos de clientes (com necessidades, recursos e perspectivas diferentes) e diferentes objetivos de cobrança por parte das provedoras de nuvem (lucro, bem estar social e preços justos).

O conjunto de requisitos apresentado serviu de base para a definição da linguagem aCCountS-DSL. Através dela pode-se definir diferentes conjuntos de regras para atender aos diferentes requisitos propostos. Eles podem ser combinados entre si, criando uma sequência de instruções (regras) para definir a política de tarifação. De modo a ilustrar seus principais elementos, o Código 3 apresenta a estrutura geral de uma política descrita usando a linguagem. Ressalta-se que seu objetivo maior é a flexibilidade na definição das políticas de tarifação, de modo a utilizar diferentes recursos das máquinas virtuais de uma nuvem de infraestrutura, que possam ser medidos e precificados.

Código 3. Definição de uma política de tarifação com a aCCountS-DSL

    Policy nome_da_politica [extends outra_politica] {
        var {
            definicao das variaveis auxiliares para definir regras de negocio;
        }
        rules {
            definicao de regras de negocio atraves de operacoes aritmeticas e logicas;
        }
        return valor_da_cobrança;
    }

Os elementos que estruturam a política são definidos da seguinte forma: o elemento policy representa o nome de uma política específica. Este elemento divide-se em duas seções. Na primeira seção, especificada pela palavra reservada var, são definidas variáveis auxiliares que servirão para compor as regras de tarifação. Como restrição da linguagem, as variáveis utilizadas só podem guardar valores numéricos de ponto flutuante, uma vez que as políticas tipicamente trabalham com manipulação de valores deste tipo. A segunda seção, chamada rules, define as regras de composição da política de tarifação por meio de atribuições, comandos de seleção e operações aritméticas sobre variáveis e valores reservados na linguagem. As regras definidas na linguagem representam as regras de negócio da política, e são definidas em função dos recursos medidos em cada máquina virtual. Ao final da definição de uma política, utiliza-se a palavra reservada return, seguida da variável que representa o cálculo final do custo para uma política de tarifação, em função dos valores calculados nas regras.

De modo a facilitar a definição de novas políticas, pode-se reutilizar uma política pré-definida, similarmente à ideia de herança (especialização), bastante conhecida em linguagens de programação. De modo semelhante, variáveis e regras da política pré-definida são reaproveitadas, além de se permitir que novas regras e variáveis sejam introduzidas ou ainda sobrepostas. Para isso, usa-se, após o nome da política, a palavra reservada extends e, em seguida, a política que se quer estender.

Para melhor ilustrar o uso da aCCountS-DSL, o Código 4 apresenta a definição de uma política, intitulada SobMedUsoPosPlus. Seu nome referencia alguns dos requisitos que ela atende, a saber: Sob-demanda (garantia de alocação), Media (perfil de máquina), Uso (forma de contabilização), Pos-pago (modelo) e Plus, por contabilizar questões a mais (requisitos Utilitários e de Bem estar social). Nela são definidas dez variáveis para auxiliar a definição das regras: taxaCentralDados, memoria, cpu, armazenamento, transacaoBD, upload, descontoEnergia, descontoUsuario, desconto e custo.

Código 4. Política de tarifação criada por meio da aCCountS-DSL

    Policy SobMedUsoPosPlus {
        var {
            taxaCentralDados, memoria, cpu, armazenamento;
            upload, transacaoBD, custo, descontoEnergia;
            descontoUsuario, desconto;
        }
        rules {
            descontoEnergia = 0;
            descontoUsuario = 0;
            taxaCentralDados = 0.14;

            memoria = instance.memoria * $memoria;
            cpu = instance.cpu * $cpu;
            armazenamento = instance.armazenamento * $armazenamento;
            transacaoBD = instance.transacaoBD * $transacaoBD;
            upload = instance.upload * $upload;

            if (instance.economiaEnergia >= 0.5) { descontoEnergia = 0.03; }
            if (instance.memoria >= 0.8) { descontoUsuario = 0.05; }
            if (instance.armazenamento >= 0.8) { descontoUsuario = 0.04; }

            custo = memoria + cpu + armazenamento + transacaoBD + upload
                    + instance.software + instance.servico + taxaCentralDados;
            desconto = instance.sla + descontoEnergia + descontoUsuario;
            custo = custo - custo * desconto;
        }
        return custo;
    }

Um aspecto fundamental para a política de tarifação é que a mesma precisa ter acesso aos valores definidos durante a configuração do serviço em relação aos recursos tarifados e seus respectivos preços. Para este propósito, a aCCountS-DSL utiliza dois elementos. O primeiro é representado pela expressão instance.recurso, que representa o valor de um recurso monitorado (por exemplo, o percentual médio de uso de memória) em uma instância de uma máquina virtual. O segundo é representado pelos valores definidos na precificação dos recursos, que são acessados utilizando como prefixo o símbolo $. Esses valores são recuperados a partir dos componentes do aCCountS-Service. O suporte à execução das políticas definidas na linguagem aCCountS-DSL é feito pelo DSLCompiler.
Este componente atua como um compilador, verificando sintaticamente as polı́ticas e gerando versões das mesmas para outras linguagens, que então podem ser executadas. No caso particular de nossa proposta, o DSLCompiler teve uma grande influência do Xtext5 , um framework open-source para o desenvolvimento de DSLs. O Xtext permite que a partir da definição das regras de 5 http://www.eclipse.org/Xtext/ sintaxe da linguagem, sejam gerados seus analisadores léxico e sintático, além da personalização de um gerador de código, no qual pode-se associar os comandos da DSL desenvolvida com os de alguma outra linguagem de programação. Especificadamente para o aCCounts, escolheu-se Ruby, uma linguagem de crescente popularidade para desenvolvimento de aplicações Web de modo ágil e rápido. Porém, sua caracterı́stica mais importante para nossa proposta é o fato da mesma interpretar dinamicamente seu código-fonte. Com isso, o código gerado pelo Xtext pode ser integrado naturalmente ao serviço, em tempo de execução, permitindo que modificações nas polı́ticas sejam refletidas de imediato na tarifação dos recursos. variáveis e regras. Com isso a polı́tica Res1MedUsoPosPlus contém onze variáveis, sendo dez da polı́tica estendida e a décima primeira, a variável local taxaFixa. Esta nova variável representa taxa paga pelo firmamento do contrato de 1 ano, que no contexto da polı́tica vale $ 0.0014 mensais. A regra atribuı́da à variável custo é redefinida, e uma nova polı́tica é definida com poucas linhas de código. Código 5. 1 2 3 4 5 No processo de geração de código para uma polı́tica de 67 tarifação, o DSLCompiler entende que as ocorrências instance 8 são recursos medidos na máquina virtual. Estes valores devem 9 ser recuperados no AccountingManager, que através do nome 10 da variável, retorna os dados dos recursos medidos em uma máquina virtual em um determinado perı́odo. Já as variáveis iniciadas com $ equivalem aos preços dos recursos e, por sua vez, devem ser obtidos no componente ProfileManager, através da entidade Price, que retorna o preço correspondente ao recurso identificado. Como detalhado na explanação sobre os componentes do aCCounts, os nomes definidos para os recursos são previamente configurados no componente ResourcesManager, e para simplificar os processos, o nome do preço é, por convenção, o mesmo nome do recurso. Com isto, o DSLCompiler consegue gerar um código executável em Ruby, com regras equivalentes às definidas na DSL proposta de forma bem simples, uma vez que as construções da aCCountS-DSL são restritas a operações aritméticas, comandos de seleção e reutilização de regras. Na polı́tica SobMedUsoPosPlus, os valores de instance.economiaEnergia e instance.sla são medidos na máquina virtual. Eles representam porcentagens de economia de energia realizada por um usuário e de violação no acordo de serviço (por exemplo, tempo que a máquina ficou indisponı́vel), evidenciados na máquina virtual respectivamente. Esses valores são calculados com a finalidade de verificar as configurações das máquinas virtuais do usuário e o quanto elas promovem a economia de energia, baseado em análises de desempenho [14] e, no caso do SLA, o quanto os acordos de qualidade de serviços são atendidos ou violados. As variáveis de instância instance.memoria, instance.cpu, instance.armazenamento, instance.transacaoBD, instance.upload são medidas de processamento na máquina virtual. 
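Apenas como ilustração do mapeamento descrito acima, segue um esboço hipotético, em Ruby, de como uma parte da política SobMedUsoPosPlus poderia ser traduzida pelo DSLCompiler; a forma exata do código gerado não é mostrada no artigo, e os nomes de função e de variáveis aqui são suposições.

    # Esboço hipotético do código Ruby gerado para parte de SobMedUsoPosPlus.
    # 'instance' simula os registros de medição (obtidos via AccountingManager) e
    # 'preco' simula os preços do perfil (obtidos via ProfileManager/Price).
    def calcular_custo(instance, preco)
      memoria       = instance["memoria"] * preco["memoria"]
      cpu           = instance["cpu"] * preco["cpu"]
      armazenamento = instance["armazenamento"] * preco["armazenamento"]

      desconto_energia = instance["economiaEnergia"] >= 0.5 ? 0.03 : 0.0

      custo = memoria + cpu + armazenamento + 0.14   # taxaCentralDados
      custo - custo * (instance["sla"] + desconto_energia)
    end

    medicao = { "memoria" => 0.5, "cpu" => 0.5, "armazenamento" => 0.5,
                "economiaEnergia" => 0.3, "sla" => 0.03 }
    precos  = { "memoria" => 0.1, "cpu" => 0.1, "armazenamento" => 0.1 }
    puts calcular_custo(medicao, precos)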
As variáveis descontoEnergia e descontoUsuario são definidas a partir de comandos de seleção declaradas na polı́tica, atendendo a requisitos relacionados ao Bem estar social e Utilitários. A taxaCentralDados, definida na polı́tica com o valor de $ 0.14 mensais, é um valor pago pelo uso das máquinas do centro de dados da provedora de nuvem. O Código 5 exemplifica o caso da extensão de uma polı́tica pré-existente, em que é utilizado a palavra reservada extends. Na polı́tica Res1MedUsoPosPlus, seu nome evidencia alguns dos requisitos que ela atende: Reservada 1ano (garantia de alocação), Media (perfil de máquina), Uso (forma de contabilização), Pos-pago (modelo), Plus (requisitos Utilitários e Bem estar social). Ela estende a polı́tica SobMedUsoPosPlus, representada na Figura 4 e dessa forma reutiliza todas suas 72 Polı́tica criada por meio da aCCountS-DSL usando extends Policy Res1MedUsoPosPlus extends SobMedUsoPosPlus{ var { custo, taxaFixa; } rules { taxaFixa = 0.0014; custo = custo + taxaFixa; } return custo; } Após a configuração realizada (polı́ticas definidas por meio da aCCountS-DSL, preços e recursos determinados no perfil das máquinas), a execução do serviço pode ser iniciada para o aCCountS realizar a tarifação das nuvens que a utiliza. V. AVALIAÇ ÃO E XPERIMENTAL Para a análise da proposta foram realizados experimentos no aCCountS e na aCCountS-DSL, a partir da prototipação de uma aplicação Web, utilizando o framework Ruby on Rails. A escolha de Ruby permitiu implementar as interfaces utilizadas pelo agentes como serviços Web, com a troca de mensagens obedecendo o formato JSON. Para persistência de dados foi utilizado um banco de dados NoSQL, o MongoDB, que deu suporte à dinamicidade e desempenho que o serviço precisava. As informações persistidas incluı́ram as polı́ticas, as relações de herança e os dados de uso das máquinas. Do lado do agente também foi usado o mesmo framework, porém não como uma aplicação Web, mas sim, uma aplicação convencional executada como uma tarefa no sistema operacional hospedeiro, com periodicidade pré-definida. Essas tarefas são responsáveis por todas as atividades do agente, requisitando as variáveis a serem medidas, coletando dados de uso, efetuando a mediação e enviando os últimos ao serviço aCCountS. Para armazenamento temporário dos dados no agente, usou-se o banco de dados SQLite, de forma a tornar leve e rápido o processamento dos dados de uso em cada máquina virtual. O protótipo pode ser acessado no endereço http://accounts.zn.inf.br. Devido ao uso do framework Ruby on Rails, as principais ideias implantadas por ele foram seguidas no desenvolvimento do protótipo, como: (i) DRY (don’t repeat yourself ), estimulando a reutilização de código entre componentes, e (ii) Convention over Configuration (Convenções sobre Configurações), de forma a minimizar a necessidade de configurações para o projeto funcionar, precisando apenas seguir as recomendações do framework (como nomes de arquivos, localização de pastas, etc.). Da mesma forma, alguns padrões de projeto foram adotados durante o desenvolvimento, tanto por o framework influenciar nesse uso, quanto por decisões de projeto para o funcionamento da arquitetura do protótipo. 
São eles: (i) o padrão MVC (model-view-controller), usado na estruturação das camadas da aplicação, tanto no serviço quanto no agente; (ii) REST (Representational State Transfer), usado no serviço para a disponibilização de recursos através de rotas para acesso do agente; (iii) Singleton, principalmente no agente, que deve ter referência única para o recebimento de informações, monitoramento da máquina e envio de dados para o serviço; (iv) Template Method, usado nas classes geradas pelo compilador da DSL, de forma a ter-se métodos genéricos que se adaptam de acordo com o código gerado; e (v) Observer, acoplado ao padrão MVC e, assim, utilizado na arquitetura do sistema para disparar métodos em caso de mudança de estado em objetos.

A Figura 4 ilustra uma funcionalidade: a tela de manipulação das máquinas ou instâncias do aCCountS-Service. Na parte superior da tela está disposto um menu com opções para o gerenciamento de recursos (variables), perfis (profiles), políticas (policies) e máquinas (machines). Para cada uma destas opções é possível criar, alterar e remover itens. No caso da tela apresentada, o administrador da provedora de nuvens pode cadastrar instâncias, associá-las a uma política de tarifação e a um perfil criados anteriormente, e disponibilizá-las aos clientes. O cliente escolhe, então, uma instância cujo perfil de hardware corresponda às suas necessidades, para ser utilizada na tarifação de sua máquina virtual.

Fig. 4. Screenshot do protótipo da arquitetura aCCountS - associação entre clientes, perfis e políticas

Em nossos testes foram implantadas máquinas virtuais em diferentes provedoras de nuvem, cada uma associada a uma política de tarifação diferente. Por meio desses testes, buscou-se mostrar que o fluxo de tarifação entre os dois componentes (agente e servidor) ocorre de forma correta. Para verificar se a tarifação também estava correta, foi realizado o envio de dados de medição fictícios ao serviço, o que permitiu comparar os resultados calculados pelo aCCountS com os valores esperados como resposta.

A. Avaliação do aCCountS

Para validação do fluxo de tarifação foram criadas três máquinas virtuais na Amazon e três máquinas virtuais no Windows Azure, onde foi implantado o aCCountS-Agent. Associadas a essas máquinas, foram criadas três combinações de política/perfil, de modo que se utilizaram as mesmas configurações nas duas nuvens. As políticas definidas para cada nuvem e seus perfis correspondentes são:

(i) simplesUsoSobPos, com garantia de alocação sob-demanda, modelo de pagamento pós-pago, perfil de máquina pequena, forma de contabilização por uso de CPU, memória e armazenamento (uso CPU/h: $ 0.006, uso memória/h: $ 0.006, uso armazenamento/h: $ 0.006) e taxa de uso da máquina ou centro de dados de $ 0.14;

(ii) simplesTempoSobPos, com garantia de alocação sob-demanda, modelo de pagamento pós-pago, perfil de máquina pequena e forma de contabilização por tempo de uso da máquina (tempo de uso da máquina/h: $ 0.06); e

(iii) simplesUsoRes1Pos, com garantia de alocação reservada por 1 ano (taxa fixa: $ 0.0014), modelo de pagamento pós-pago, perfil de máquina pequena, forma de contabilização por uso de CPU, memória e armazenamento (uso CPU/h: $ 0.0034, uso memória/h: $ 0.0034, uso armazenamento/h: $ 0.0034) e taxa de uso da máquina ou centro de dados de $ 0.14.

No experimento, as seis máquinas virtuais foram executadas do dia 26 de junho de 2013, às 12h05, até o dia 03 de julho, às 4h05, e seus registros de medição (calculados a cada hora) foram utilizados para validação do fluxo de tarifação. Na Tabela I são mostrados seis dos 960 registros de medição recebidos pelo aCCountS-Service (três de cada provedora de nuvem) e os respectivos registros de cobrança calculados.

Tabela I. Resultados dos experimentos realizados na Amazon EC2 e no Windows Azure

    Política (Nuvem)             | Registro de Medição                                                    | Reg. Cobrança
    simplesUsoSobPos (Amazon)    | {"cpu": 0.01545, "memoria": 0.80414, "armazenamento": 2.7e-05}         | $ 0.00492
    simplesUsoRes1Pos (Amazon)   | {"cpu" => 0.075, "memoria" => 0.8302205, "armazenamento" => 4.1e-05}   | $ 0.00448
    simplesTempoSobPos (Amazon)  | {"tempoUso": 1.0}                                                      | $ 0.06000
    simplesUsoSobPos (Azure)     | {"cpu": 0.06428, "memoria": 0.79215, "armazenamento": 0.05129}         | $ 0.00545
    simplesUsoRes1Pos (Azure)    | {"cpu" => 0.06851851851851852, "memoria" => 0.5443499818181821, "armazenamento" => 0.05128383636363637} | $ 0.00366
    simplesTempoSobPos (Azure)   | {"tempoUso": 1.0}                                                      | $ 0.06000

Dessa forma, verificou-se a corretude, visto que os recursos medidos nas máquinas foram os previamente definidos no aCCountS-Service e os registros de faturamento foram calculados conforme as definições nas políticas e a partir dos dados enviados pelo agente. Por exemplo, para o primeiro registro da Amazon, a política simplesUsoSobPos resulta em 0.006 × (0.01545 + 0.80414 + 0.000027) ≈ $ 0.00492, exatamente o registro de cobrança mostrado na Tabela I.

B. Avaliação da aCCountS-DSL

Para validação da linguagem aCCountS-DSL foram criadas dez políticas de tarifação, combinando os diferentes requisitos apontados nesse trabalho. Cada política de tarifação foi testada com três conjuntos de valores fixos dos recursos, associados a três casos de teste para o sistema. Desta forma, conhecia-se de antemão o valor esperado da fatura. Por meio de um script, esses dados foram enviados para o aCCountS-Service realizar a tarifação por meio das políticas definidas na aCCountS-DSL. No experimento foram utilizados três casos de teste (conjuntos de valores fixos dos recursos, isto é, os registros de medição) para verificar a corretude da tarifação em relação às diferentes políticas definidas. Estes casos de teste estão definidos na Tabela II.

Tabela II. Registros fixos usados como casos de teste

    Caso de teste | Registro
    CT1 | {"economiaEnergia": 0.3, "memoria": 0.5, "cpu": 0.5, "armazenamento": 0.5, "transacaoBD": 0.5, "upload": 0.5, "software": 0.07, "servico": 0.07, "sla": 0.03}
    CT2 | {"economiaEnergia": 0.5, "memoria": 0.8, "cpu": 0.8, "armazenamento": 0.8, "transacaoBD": 0.8, "upload": 0.8, "software": 0.07, "servico": 0.07, "sla": 0.0}
    CT3 | {"economiaEnergia": 0.0, "memoria": 0.3, "cpu": 0.3, "armazenamento": 0.3, "transacaoBD": 0.3, "upload": 0.3, "software": 0.14, "servico": 0.07, "sla": 0.0}

Na Tabela III podem ser observados os resultados obtidos a partir da aplicação dos registros especificados como casos de teste na Tabela II em cada uma das políticas descritas anteriormente.

Tabela III. Resultados obtidos no experimento

    Política          | CT1 Esperado | CT1 Obtido | CT2 Esperado | CT2 Obtido | CT3 Esperado | CT3 Obtido
    SobMedUsoPosPlus  | 0.30070      | 0.30070    | 0.30504      | 0.30504    | 0.36800      | 0.36800
    Res1MedUsoPosPlus | 0.28949      | 0.28949    | 0.28710      | 0.28710    | 0.36160      | 0.36160
    SobPeqUsoPosPlus  | 0.28615      | 0.28615    | 0.28272      | 0.28272    | 0.35900      | 0.35900
    SobMedTmpPosPlus  | 0.25220      | 0.25220    | 0.25220      | 0.25220    | 0.33000      | 0.33000

Pelo resultado do experimento, constatou-se que, em todos os casos de teste, o serviço calcula o valor de tarifação como esperado. Desta forma, conclui-se que o cálculo da fatura é realizado de forma correta, e que a linguagem aCCountS-DSL pode ser utilizada com segurança para tarifação de serviços de infraestrutura em computação em nuvem, definindo regras flexíveis. VI.
[2] [3] C ONCLUS ÕES E T RABALHOS F UTUROS A tarifação de recursos é uma caracterı́stica fundamental da computação em nuvem. Através de um levantamento bibliográfico foi possı́vel verificar que existem muitos desafios de pesquisa relacionados ao tema. Guiado por estes estudos, este trabalho busca melhorar a flexbilidade na definição de como são definidos e tarifados os recursos oferecidos por provedores de nuvens de infraestrutura. Foram definidos e apresentados (i) uma arquitetura para sistemas de tarifação em nuvem independente de um provedor especı́fico, (ii) o aCCountS, serviço de tarifação desenvolvido a partir da arquitetura proposta e (iii) uma linguagem de tarifação flexı́vel, a aCCountSDSL, que permite definir polı́ticas de cobrança flexı́veis para atender diferentes perfis de clientes (necessidades, recursos, perspectivas) e diferentes objetivos das provedoras de nuvem (lucro, bem estar social, preço justo). Com os experimentos realizados no aCCountS e na linguagem aCCountS-DSL, foi mostrado que a partir da arquitetura proposta é possı́vel desenvolver um sistema capaz de tarifar serviços de infraestrutura em nuvem de forma correta e independente. Com base na DSL apresentada, pode-se criar polı́ticas de tarifação flexı́vel e de fácil modificação. Dessa forma, esse trabalho atendeu ao seu objetivo, trazendo uma solução para os problemas de tarifação apresentados e, com isso, pôde contribuir com a área de tarifação em computação em nuvem. Como trabalhos futuros pode-se citar a extensão do serviço para atender requisitos de segurança, disponibilidade e escalabilidade, que são de suma importância para utilização de serviço de tarifação em nuvem. [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] R EFER ÊNCIAS [1] L. M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A break in the clouds: towards a cloud definition,” Rev. Commun. SIGCOMM Comput., vol. 39, no. 1, pp. 50–55, December 2008. [Online]. Available: http://doi.acm.org/10.1145/1496091.1496100 74 M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “Above the clouds: A berkeley view of cloud computing,” EECS Department, University of California, Berkeley, Tech. Rep., Fevereiro 2009. E. da Silva and D. Lucredio, “Software engineering for the cloud: A research roadmap,” in XXVI Brazilian Symposium on Software Engineering (SBES 2012), Natal-RN, Setember 2012, pp. 71–80. I. Ruiz-Agundez, Y. Penya, and P. Bringas, “A flexible accounting model for cloud computing,” in Annual SRII Global Conference (SRII 2011), San Jose, USA, March 2011, pp. 277–284. F. A. P. da Silva, P. Silveira, V. Garcia, R. Assad, and F. Trinta, “Accounting models in cloud and grid computing: A systematic mapping study,” in VIII International Conference on Grid Computing and Applications, ser. GCA ’12, Las Vegas, United States, July 2012, p. 7. A. Caracas and J. Altmann, “A pricing information service for grid computing,” in Proceedings of the 5th international workshop on Middleware for grid computing: held at the ACM/IFIP/USENIX 8th International Middleware Conference, ser. MGC ’07. New York, NY, USA: ACM, November 2007, pp. 4:1–4:6. [Online]. Available: http://doi.acm.org/10.1145/1376849.1376853 A. Narayan, S. Rao, G. Ranjan, and K. Dheenadayalan, “Smart metering of cloud services,” in IEEE International Systems Conference (SysCon 2012), Ottawa, Ontario, Canada, March 2012, pp. 1–7. H. S. 
Group, “Business rule management system,” Website, 2012, http://www.hartmannsoftware.com/pub/Enterprise-RuleApplications/brms. Amazon Web Service, “Amazon elastic compute cloud (amazon EC2),” Website, 2012, http://aws.amazon.com/pt/ec2/. M. Dantas, ”A lógica do capital-informação”, Contraponto, Ed., 2002. E. Elmroth, F. Marquez, D. Henriksson, and D. Ferrera, “Accounting and billing for federated cloud infrastructures,” in 8th International Conference on Grid and Cooperative Computing (GCC 2009), Lanzhou, China, August 2009, pp. 268–275. I. Ruiz-Agundez, Y. K. Penya, and P. G. Bringas, “A taxonomy of the future internet accounting process,” IV International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 10), p. 7, October 2010, florence, Italy. Y. Yu and S. Bhatti, “Energy measurement for the cloud,” in International Symposium on Parallel and Distributed Processing with Applications (ISPA 2010), Taipei, Taiwan, september 2010, pp. 619– 624. T. Imada, M. Sato, and H. Kimura, “Power and qos performance characteristics of virtualized servers,” in X IEEE/ACM International Conference on Grid Computing - GRID 2009, Banff, Alberta, Canada, October 2009, pp. 232–240. AspectJ-based Idioms for Flexible Feature Binding Rodrigo Andrade∗ , Henrique Rebêlo∗ , Márcio Ribeiro† , Paulo Borba∗ ∗ Informatics Center, Federal University of Pernambuco Email: rcaa2,hemr,[email protected] † Computing Institute, Federal University of Alagoas Email: [email protected] Abstract—In Software Product Lines (SPL), we can bind reusable features to compose a product at different times, which in general are static or dynamic. The former allows customizability without any overhead at runtime. On the other hand, the latter allows feature activation or deactivation while running the application with the cost of performance and memory consumption. To implement features, we might use aspect-oriented programming (AOP), in which aspects enable a clear separation between base code and variable code. In this context, recent work provides AspectJ-based idioms to implement flexible feature binding. However, we identified some design deficiencies. Thus, to solve the issues of these idioms, we incrementally create three new AspectJ-based idioms. Moreover, to evaluate our new idioms, we quantitatively analyze them with respect to code cloning, scattering, tangling, and size by means of software metrics. Besides that, we qualitatively discuss our new idioms in terms of code reusability, changeability, and instrumentation overhead. Keywords: Software Product Lines; Aspect-Oriented Programming; Idioms; Flexible Feature Binding. I. I NTRODUCTION A Software Product Line (SPL) is a family of softwareintensive systems developed from reusable assets. By reusing such assets, it is possible to construct a large number of different products applying compositions of different features [1]. Depending on requirements and composition mechanisms, features should be activated or deactivated at different times. In this context, features may be bound statically, which could be, for instance, at compile time or preprocessing. The benefit of this approach is to facilitate the applications’ customizability without any overhead at runtime [2]. Therefore, this static feature binding is suitable for applications running on devices with constrained resources, such as certain mobile phones. On the other hand, features may be bound dynamically (e.g. 
at run or link time) to allow more flexibility, with the cost of performance and memory consumption. Furthermore, if the developers do not know, before runtime, the set of features that should be activated, they could use dynamic feature binding to activate features on demand. To support flexible binding for feature code implemented using aspects [3], which we focus on this work, we proposed Layered Aspects [4]. This solution reduces several problems identified in a previous work [5], such as code cloning, scattering, and tangling [4], [6]. Although these goals are achieved to some extent, Layered Aspects still presents some deficiencies. It may introduce feature code scattering and instrumentation overhead to the flexible feature binding implementation. Additionally, applying Layered Aspects demands several changes, which could hamper the reuse of the flexible feature binding implementation. Hence, to address the Layered Aspects issues and still have low rates of code cloning, scattering, and tangling, we 75 define three new idioms based on AspectJ [7], which we call increments, as they incrementally address the Layered Aspects issues. In our context, we use the terminology idiom instead of pattern because our increments are more AspectJ specific and address a smaller and less general problem than a pattern. The first increment addresses part of the issues with the aid of Java annotations. The second increment uses the @AspectJ syntax [8] to address more Layered Aspects issues, although, because this syntax does not support intertypes, it may introduce problems, such as feature code scattering. In this context, due to AspectJ traditional syntax limitations, our final idiom uses an internal resource of AspectJ’s compiler to address all the issues without introducing @AspectJ syntax problems. To evaluate these idioms, we extract the code of a total of 13 features from four different product lines and we apply each new idiom plus Layered Aspects to implement flexible binding for these features. Then, to evaluate whether our new idioms do not present worse results than Layered Aspects with respect to code cloning, scattering, tangling, and size, we quantitatively assess the idioms by means of software metrics. To this end, we use five metrics: Pairs of Cloned Code, Degree of Scattering across Components [9], Degree of Tangling within Components [9], Source Lines of Code, and Vocabulary Size. Additionally, we discuss the four idioms regarding three factors: their code reusability, changeability, and instrumentation overhead based on our four product lines and also on our previous knowledge about this topic [4], [6]. As result of this evaluation, we conclude that our final new idiom incrementally addresses these three factors and does not present worse results regarding the software metrics. In summary, the contributions of this paper are: • • • • • • We identify deficiencies in an existent idiom to implement flexible feature binding; We address these deficiencies by incrementally defining three new idioms to implement flexible feature binding; We apply these four idioms to provide flexible binding for 13 features from four case studies; We quantitatively evaluate the three new idioms plus the existent one with respect to code cloning, scattering, tangling, and size by means of software metrics; We discuss the idioms regarding reusability, changeability, and code instrumentation overhead; We provide the source code and sheet used in this paper [10]. 
At last, we structure the remainder of this paper as follows. In Section II, we present the motivation of our work, detailing the Layered Aspects issues. Section III introduces our three new idioms to address these issues. In Section IV, we present responsible for dynamically activating or deactivating feature code execution. It may vary from a simple user interface prompt to complex sensors, which decide by themselves whether the feature should be activated [6]. In our case, the driver mechanism reads a property value from a properties file. For instance, if we want to dynamically activate the Total feature, we would set total=true in the properties file. We do this for simplicity, since the complexity about providing information for feature activation is out of the scope of this work. To implement dynamic binding for the Total feature, we define TotalDynamic, as showed in Listing 3. Line 3 defines an if pointcut to capture the driver’s value. To allow dynamic feature binding, Lines 5-8 define an adviceexecution pointcut to deal only with before and after advice. Thus, it is possible to execute those pieces of advice defined in TotalFeature aspect (Listing 1) depending on the driver’s value. For instance, the feature code within a before or after advice in TotalFeature aspect is executed if the driver condition is set to true in Line 3 of Listing 3. In this case, the adviceexecution pointcut does not match any join point in TotalFeature because the driver is negated in Line 6, and therefore, the feature code is executed. On the other hand, if the driver condition is false, the adviceexecution pointcut matches some join points. However, feature code is not executed because we do not call proceed. Additionally, returning null in Line 7 is not harmful when the feature is deactivated because Layered Aspects does not use the adviceexecution pointcut for around advice [4]. the evaluation of Layered Aspects and our three new idioms regarding code cloning, scattering, tangling, and size as well as a qualitative discussion. Finally, Section V presents the threats to validity, Section VI discusses related work, and Section VII concludes this work. II. M OTIVATING E XAMPLE This section presents the Layered Aspects issues by showing the implementation of flexible binding for the Total optional feature of the 101Companies SPL. This product line is based on a Java version of the 101Companies project [11]. In this context, the Total feature represents the total salary of a given employee, the sum of all department salaries, or the sum of all company salaries. We do not use the same application from our previous work [4] as a toy example because we believe that the 101Companies project is best known nowadays, which makes the understanding easier. Besides that, the application we used before has not been supported for a long time1 . As mentioned in the previous section, to implement flexible binding time, we could use the Layered Aspects idiom, which makes it possible to choose between static (compile time) and dynamic (runtime) binding for features. Basically, the structure of this idiom includes three aspects. One abstract aspect implements the feature code whereas two concrete subaspects implement static and dynamic feature binding. Using the 101Companies Total feature as an example, in Listing 1, we illustrate part of this feature code. It consists of pointcuts (Line 3), advice (Line 6), intertype declarations (Line 11), and private methods, which we omit for simplicity. 
Albeit we do not show all the code in this paper, we provide it elsewhere [10]. To apply Layered Aspects, we need to change the TotalFeature aspect by including the abstract keyword in Line 1. This allows the concrete subaspect to inherit from TotalFeature, since only abstract aspects can be inherited in AspectJ [8]. Listing 1: TotalFeature aspect 1 privileged aspect TotalFeature { 2 3 pointcut newAbstractView ( AbstractView c t h i s ) : 4 e x e c u t i o n ( A b s t r a c t V i e w . new ( . . ) ) && t h i s ( c t h i s ) ; 5 6 v o i d around ( A b s t r a c t V i e w c t h i s ) : n e w A b s t r a c t V i e w ( c t h i s ) { 7 proceed ( c t h i s ) ; 8 c t h i s . t o t a l = new J T e x t F i e l d ( ) ; 9 } 10 11 p r i v a t e J T e x t F i e l d A b s t r a c t V i e w . t o t a l ; 12 ... 13 } Listing 3: Layered Aspects TotalDynamic aspect 1 aspect TotalDynamic extends T o t a l F e a t u r e { 2 3 pointcut driver () : i f ( Driver . isActivated ( " t o t a l " ) ) ; 4 5 O b j e c t around ( ) : a d v i c e e x e c u t i o n ( ) && w i t h i n ( T o t a l F e a t u r e ) 6 && ! d r i v e r ( ) { 7 return null ; 8 } 9 10 p o i n t c u t n e w A b s t r a c t V i e w ( A b s t r a c t V i e w c t h i s ) : 11 T o t a l F e a t u r e . n e w A b s t r a c t V i e w ( c t h i s ) && d r i v e r ( ) ; 12 } To implement static binding, we define TotalStatic, which is an empty concrete subaspect that inherits from TotalFeature aspect, as we illustrate in Listing 2. Thus, we are able to statically activate the feature execution by including both aspects in the project build. Listing 2: TotalStatic subaspect 1 aspect T o t a l S t a t i c extends T o t a l F e a t u r e {} Before explaining the dynamic feature activation or deactivation, we first need to introduce an important concept used throughout this paper: the driver [4]. This is the mechanism 1 http://kiang.org/jordan/software/tetrismidlet/ 76 Thereby, Layered Aspects design states that the pieces of around advice of the feature code must be deactivated one-by-one because the adviceexecution pointcut could lead to problems when the driver states the feature deactivation [4]. For such scenario, we would miss the base code execution, since the around advice matched by the adviceexecution would not be executed and consequently, the proceed() of the around advice would not be executed either, which leads to missing the base code execution that is independent of the activation or deactivation of features. Thus, to avoid this problem, Layered Aspects associates the driver with each pointcut related to an around advice defined in TotalFeature as showed in Lines 10 and 11. These lines redefine the newAbstractView pointcut and associate it with the driver. Thus, the code within the around advice defined in Listing 1 is executed only if the driver’s value is true, that is, the feature is activated. The redefinition of pointcuts for such cases is the reason why the TotalDynamic needs to inherit from TotalFeature [4], and consequently the latter needs to be an abstract aspect, since AspectJ does not provide a way to inherit from a concrete aspect. In this context, we may observe three main issues when applying Layered Aspects to implement flexible feature binding. First, the adviceexecution pointcut unnecessarily matches all pieces of advice within the feature code, including around advice. As mentioned, the adviceexecution is used only for before and after advice. This issue may cause overhead in byte code instrumentation. 
Additionally, returning null within adviceexecution pointcut is not a very elegant solution, even though this situation is not errorprone, as mentioned. The second issue is the empty concrete subaspect to implement static feature binding. We have to define it due to the AspectJ limitation, in which an aspect can inherit from another only if the latter is abstract. So this subaspect is imperative for static feature activation, since it is necessary to allow feature code instantiation. This may increase code scattering because we need an empty subaspect for each abstract aspect that implements feature code. For instance, we had to implement 15 empty concrete aspects to implement static binding for our 13 selected features. Another issue is the pointcut redefinition, which is applied when a pointcut within the feature code is related to an around advice. In this context, if there are a large number of around advice, we would need to redefine each pointcut related to them, which could lead to low productivity or even make the task of maintaining such a code hard and errorprone. Besides that, this could hinder code reusability and changeability. It may hinder the former because these pointcut redefinitions may hamper reusing the flexible binding implementation. Additionally, it hinders the latter when introducing a new driver for example, as we would need to associate the driver to each pointcut. Hence, we enumerate the main goals we try to address in this work: 1. To prevent adviceexecution pointcut to unnecessarily match around advice; 2. To avoid the empty concrete subaspect to implement static binding; 3. To eliminate the need of redefining each pointcut related to an around advice within the concrete subaspect to implement the dynamic binding. We believe that defining new idioms that address these issues may bring benefits, such as code scattering reduction, increase of reusability and changeability, and decrease of instrumentation overhead. We discuss these improvements throughout the next sections. A. First increment For this increment, we try to prevent adviceexecution pointcut to match around advice within feature code, which corresponds to the first issue. To achieve that, we use an AspectJ 5 mechanism, which includes the support for matching join points based on the presence of Java 5 annotations [8]. Listing 4: Around advice annotated 1 abstract privileged aspect TotalFeature { 2 ... 3 @AroundAdvice 4 v o i d around ( A b s t r a c t V i e w c t h i s ) : n e w A b s t r a c t V i e w ( c t h i s ) { 5 proceed ( c t h i s ) ; 6 c t h i s . t o t a l = new J T e x t F i e l d ( ) ; 7 } 8 } In this context, we create an AroundAdvice annotation and we use it to annotate all pieces of around advice within the feature code, as showed in Line 3 of Listing 4. Hence, we can prevent adviceexecution pointcut to match any of these annotated advice when applying dynamic binding. To implement the static feature binding, we include the TotalFeature and TotalStatic aspects plus the Total class in the project build. In its turn, to implement the dynamic feature binding, we change the adviceexecution pointcut by adding the !@annotation(AroundAdvice) clause. Thus, this pointcut does not match the pieces of around advice defined in TotalFeature. In Listing 5, we show the adviceexecution pointcut with the !@annotation(AroundAdvice) clause, which is the part that differs from Listing 3. Therefore, we resolve the first Layered Aspects issue. However, the other two issues remain open. 
To implement static feature binding, we include the TotalFeature and TotalStatic aspects plus the Total class in the project build. In turn, to implement dynamic feature binding, we change the adviceexecution pointcut by adding the !@annotation(AroundAdvice) clause. Thus, this pointcut does not match the pieces of around advice defined in TotalFeature. In Listing 5, we show the adviceexecution pointcut with the !@annotation(AroundAdvice) clause, which is the part that differs from Listing 3. Therefore, we resolve the first Layered Aspects issue. However, the other two issues remain open. To address them, we present more increments next.

Listing 5: TotalDynamic aspect with the first increment

1  aspect TotalDynamic extends TotalFeature {
2    ...
3    void around(): adviceexecution() && within(TotalFeature)
4        && !@annotation(AroundAdvice) {
5      if (Driver.isActivated("total")) { proceed(); }
6    }
7  }

B. Second increment

For this increment, we try to address the second and third Layered Aspects issues, which correspond to avoiding the empty concrete subaspect used to implement static binding and to eliminating the need to redefine each pointcut related to an around advice, as explained in Section II. To achieve that, we use the new @AspectJ syntax [8], which offers the option of compiling the source code with a plain Java compiler. This syntax demands that the feature code elements be annotated with the provided annotations, such as @Aspect, @Pointcut, @Around, @Before, and @After. Listing 6 shows part of the TotalFeature class, which contains feature code similar to that of Listing 1. The main differences are the annotations in Lines 1, 4, and 7, which are used in conjunction with their parameters to define an aspect, a pointcut, and an advice, respectively.

Listing 6: Total feature with the second increment

1  @Aspect
2  class TotalFeature {
3    ...
4    @Pointcut("execution(AbstractView.new(..)) && this(cthis)")
5    public void newAbstractView(AbstractView cthis) {}
6
7    @Around("newAbstractView(cthis)")
8    void around1(AbstractView cthis, ProceedingJoinPoint pjp) {
9      pjp.proceed();
10     cthis.total = new JTextField();
11   }
12 }

However, the @AspectJ syntax presents some disadvantages. First, there is no way to declare a privileged aspect [8], which would be necessary to avoid creating access methods or changing the visibility of base code elements, such as changing a field from private to public so that it is visible within the TotalFeature class. Indeed, we had to change or add get methods for eight program elements within the Total feature code alone. Second, this new syntax does not support intertype declarations [8]. Therefore, we need to define an additional aspect, implemented with the traditional AspectJ syntax, containing the needed intertype declarations.
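This additional aspect, called TotalFeatureInter later in this section, is not listed in the paper. A minimal sketch, assuming it only hosts the intertype declaration that Listing 1 kept inside TotalFeature (with its visibility relaxed, as the text describes), might be:

import javax.swing.JTextField;

// Sketch only: companion aspect in traditional AspectJ syntax holding the
// intertype declaration that the @AspectJ-style TotalFeature class cannot
// express. The field is no longer private so that TotalFeature can reach it.
public aspect TotalFeatureInter {
    public JTextField AbstractView.total;
}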
Despite these limitations, we could eliminate the empty concrete aspect used to implement static feature binding. Since the TotalFeature of Listing 6 is a class rather than an abstract aspect, we are able to instantiate it without the concrete subaspect. In this way, to statically activate the Total feature, we need to include the TotalFeature and Total classes and the TotalFeatureInter aspect, which is the aspect containing the intertype declarations, as explained. To implement dynamic feature binding, we use an adviceexecution pointcut, which matches before, after, and around advice. Hence, we do not need to redefine pointcuts related to around advice. Therefore, we address the third Layered Aspects issue. Listing 7 illustrates how this increment deals with dynamic feature binding. Lines 4-14 define an adviceexecution pointcut using the @AspectJ syntax, in a similar way to the one defined in Listing 3. Besides the syntax, the difference is how we deal with scenarios in which the feature is dynamically deactivated. Thus, we define the proceedAroundCallAtAspectJ method in a separate class and call it in Line 10, which allows us to call the proceed join point of the matched pieces of advice defined within TotalFeature. Hence, even if the Total feature is dynamically deactivated, the execution of other functionalities is not compromised [4].

Listing 7: TotalDynamic class with the second increment

1  @Aspect
2  public class TotalDynamic {
3    @Around("adviceexecution() && within(TotalFeature)")
4    public Object adviceexecutionIdiom(JoinPoint thisJoinPoint,
5        ProceedingJoinPoint pjp) {
6      Object ret;
7      if (Driver.isActivated("total")) {
8        ret = pjp.proceed();
9      } else {
10       ret = Util.proceedAroundCallAtAspectJ(thisJoinPoint);
11     }
12     return ret;
13   }
14 }

Although we address the three Layered Aspects issues with our second increment, it still presents some undesired points. First, the @AspectJ syntax is limited: it does not support privileged aspects, intertype declarations, and exception handling [8]. Furthermore, the pointcut and advice definitions within the annotation statement are verified only at weaving time rather than at compile time, as with the traditional syntax. This could hamper code maintenance and error finding. Therefore, in the next increment, we try to keep addressing the three Layered Aspects issues without using the @AspectJ syntax.

C. Final increment: AroundClosure

Now, we improve our previous increment by addressing all three Layered Aspects issues presented in Section II, but without introducing the @AspectJ syntax deficiencies. To achieve that, we still need to avoid these three issues while using the traditional AspectJ syntax. The AroundClosure idiom does not demand any changes in the feature code implementation shown in Listing 1. Thus, to provide flexible binding for the Total feature with AroundClosure, we need the Total class plus the TotalFeature and TotalDynamic aspects, as shown in Listings 1 and 8, respectively. In this context, since TotalFeature is not an abstract aspect as in Layered Aspects or our first increment, it is not necessary to have an empty concrete subaspect to implement static feature binding. We just include the TotalFeature aspect and the Total class in the project build to statically activate the Total feature. Further, to implement dynamic feature binding, we define the TotalDynamic aspect, as illustrated in Listing 8. We define a generic advice using the adviceexecution pointcut that works with before, after, and around advice. Hence, we do not need to redefine each pointcut within the feature implementation that is related to an around advice. Thereby, TotalDynamic does not extend TotalFeature, so the abstract aspect is no longer needed. More specifically, to deal with dynamic feature binding, we just call proceed() in Line 4, so the feature code within the advice defined in TotalFeature is executed normally. We have to define the around advice as returning an Object in Line 2 to make it generic, avoiding compilation errors when an around advice that is not void is present in the feature implementation.

Listing 8: TotalDynamic aspect with AroundClosure

1  aspect TotalDynamic {
2    Object around(): adviceexecution() && within(TotalFeature) {
3      if (Driver.isActivated("total")) {
4        return proceed();
5      } else {
6        return Util.proceedAroundCall(thisJoinPoint);
7      }
8    }
9  }
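For before and after advice, the guard in Listing 8 is sufficient on its own. The fragment below is illustrative only; it is not part of the original listings and is assumed to be declared inside the TotalFeature aspect of Listing 1.

// Illustrative fragment, assumed to live inside the TotalFeature aspect of
// Listing 1. Because it is not around advice, the single advice of Listing 8
// simply skips it when Driver.isActivated("total") is false, and no base code
// execution is lost.
after(AbstractView cthis): newAbstractView(cthis) {
    cthis.total.setEditable(false);
}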
On the other hand, it is not trivial to deal with the scenario in which the feature is dynamically deactivated, due to around advice. This kind of advice uses a special form (proceed) to continue with the normal base code flow of execution at the corresponding join point. This special form is implemented by generating a method that takes all of the original arguments to the around advice plus an additional AroundClosure object that encapsulates the base code flow of execution [12], which has been interrupted by the pieces of advice related to the feature and afterwards interrupted by the adviceexecution pointcut. Thus, in Line 6, we call the proceedAroundCall method passing as argument thisJoinPoint, which contains reflective information about the current join point of the feature code advice that adviceexecution is matching.

Listing 9: The proceedAroundCall method

1  static Object proceedAroundCall(JoinPoint thisJoinPoint) {
2    ...
3    Object[] args = thisJoinPoint.getArgs();
4    int i = (args.length - 1);
5    if (args[i] instanceof AroundClosure) {
6      return ((AroundClosure) args[i]).run(args);
7    }
8  }

To avoid missing the base code flow of execution when the feature is dynamically deactivated, Listing 9 defines part of the proceedAroundCall method. First, we obtain an array with the arguments of the matched advice through the thisJoinPoint information in Line 3. By means of this array, we obtain the AspectJ AroundClosure object. Thus, we directly call the AroundClosure method run in Line 6, which executes the base code. This run method is automatically called under the hood by the proceed of each around advice. However, since we miss this proceed when the feature is dynamically deactivated, we need to call run manually so that we do not miss the base code execution. As explained, this idiom uses the AroundClosure object, which is an internal resource of AspectJ's compiler. Therefore, for the correct operation of this idiom, the AroundClosure object must be present in the compiler. Although our focus is only AspectJ, other AOP-based compilers also include this object [13], [14]. At last, by means of our new AroundClosure idiom, we address the Layered Aspects issues without introducing the @AspectJ syntax problems. Therefore, we recommend developers to use AroundClosure instead of Layered Aspects, although the other increments could be used as idioms as well. Furthermore, AroundClosure brings some advantages with respect to code reusability, changeability, and bytecode instrumentation. Besides that, it does not worsen the metrics results with respect to code cloning, scattering, tangling, and size.
We discuss these matters in Section IV.

IV. EVALUATION

In this section, we explain our evaluation. In Section IV-A, we quantitatively evaluate our new idioms and Layered Aspects in a similar way to our previous work [4], to avoid bias. Besides that, we discuss our three new idioms and Layered Aspects regarding code reusability, changeability, and instrumentation overhead in Section IV-B. In this context, we consider 13 features of four case studies: two features of 101Companies [11], eight features of BerkeleyDB [15], one feature of ArgoUML [16], and two features of Freemind [17]. Besides 101Companies, the other three case studies are the same from our previous work [4]. This is important to show the gains obtained with the new idioms on top of the same features. In this way, we avoid biases such as implementing flexible binding for features that present different degrees of scattering or tangling. In Table I, we map the 13 features to the respective case study. These case studies represent different sizes, purposes, architectures, granularities, and complexities. Moreover, the code of their features covers different feature model types, such as optional, alternative, and mandatory features [18].

TABLE I: Case study and features

Case study    | Features
Freemind      | Icons and Clouds
ArgoUML       | Guillemets
101Companies  | Total and Cut
BerkeleyDB    | EnvironmentLock, Checksum, Delete, LookAheadCache, Evictor, NIO, IO, and INCompressor

To perform our evaluation, we follow four main procedures. However, we only execute the third and fourth procedures for the BerkeleyDB case study, as we discuss in Section V-A. First, to create the product lines from the original code of these case studies, we assigned the code of their features by using the prune dependency rules [19], which state that "a program element is relevant to a feature if it should be removed, or otherwise altered, when the feature is pruned from the application". By following these rules, we could identify all the code related to the features. We chose this rule to reduce the introduction of bias while identifying feature code. Second, we extracted part of this code into AspectJ aspects. We say part because some feature code does not need to be extracted into aspects, as it can remain localized in classes. The code within these classes is not extracted into aspects because the whole class is relevant only to the feature, so, by the prune dependency rules, it should not exist in the base code. Additionally, there are references to the elements of these classes only within the feature code. Each feature's code is localized in a different and unique package, which contains aspects and, possibly, classes. Third, to evaluate our three new idioms and Layered Aspects, we applied each one of our three new idioms plus Layered Aspects to implement flexible binding for the 13 features of the four case studies. For 101Companies, we apply each one of our three new idioms plus Layered Aspects to implement flexible binding for two features. This product line has nearly 900 lines of code, of which about 300 are feature code. For BerkeleyDB, we apply the four idioms to implement flexible binding for eight features of the BerkeleyDB product line [15]. This product line has around 32,000 lines of code, and the eight features sum up to approximately 2,300 lines of code. This allows us to test our new AroundClosure idiom and the increments in a large and widely used application. For ArgoUML, we create a product line by extracting the code of one feature into AspectJ aspects.
Then, we apply the four idioms presented to implement flexible binding for this feature. Our ArgoUML product line has nearly 113,000 lines of code and about 200 lines of feature code. For Freemind, we also extract the code of two features into AspectJ aspects. Then, we apply the four idioms to provide flexible feature binding for these two features. The Freemind product line has about 67,000 lines of code, and both features together have approximately 4,000 lines. Fourth, we collect the number of lines of code (LOC) of the relevant components, such as feature or driver code, to provide as input to compute the metrics. We use Google CodePro AnalytiX² to obtain the LOC and we use spreadsheets to support the computation of the metrics. Moreover, we detail the selected metrics and results in Section IV-A. At last, we provide all the source code and spreadsheets elsewhere [10].

² https://developers.google.com/java-dev-tools/download-codepro

A. Quantitative analysis

To drive the quantitative evaluation of our idioms, we follow the Goal-Question-Metric (GQM) design [20]. We structure it in Table II. We use Pairs of Cloned Code in Section IV-A1 to answer Question 1, as cloning may indicate a design that could increase maintenance costs [21], because a change would have to be done twice to the duplicated code. To answer Question 2, we use Degree of Scattering across Components [9] in Section IV-A2 to measure the implementation scattering of each idiom regarding driver and feature code. To answer Question 3, we measure the tangling between driver and feature code considering the Degree of Tangling within Components [9] metric in Section IV-A3. Furthermore, Source Lines of Code and Vocabulary Size are well-known metrics for quantifying module size and complexity. So, in Section IV-A4, we answer Question 4 by measuring the size of each idiom in terms of lines of code and number of components. Although we show only part of the graphs and data in this section, we provide them completely elsewhere [10].

TABLE II: GQM

Goal: Evaluate (purpose) idioms (object) regarding the cloning, scattering, tangling, and size of their flexible binding implementation for features (issue), from a software engineer viewpoint (viewpoint).

Questions and Metrics:
Q1 - Do the new idioms increase code cloning? Pairs of Cloned Code (PCC)
Q2 - Do the new idioms increase driver and feature code scattering? Degree of Scattering across Components (DOSC)
Q3 - Do the new idioms increase tangling between driver and feature code? Degree of Tangling within Components (DOTC)
Q4 - Do the new idioms increase lines of code and number of components? Source Lines of Code (SLOC), Vocabulary Size (VS)

1) Cloning: To answer Question 1 and try to determine that our new idioms do not increase code cloning, we use the CCFinder [22] tool to obtain the PCC metric results. CCFinder is a widely used tool to detect cloned code [23], [24], [25]. As in our previous work [4], we use 12 as the token set size (TKS) and 40 as the minimum clone length (in tokens), which means that, to be considered cloned, two pieces of code must have at least 40 equal tokens. In general, the four idioms present similar results. There is no code replication for eight features out of the 13, for any of the four idioms. Additionally, the idioms lead to low PCC rates for the code of the five features that do present code replication [10]. Therefore, our new idioms do not increase code cloning. This is the answer to Question 1.

2) Scattering: To answer Question 2, we use DOSC to analyze feature and driver code scattering for each idiom.
Feature and driver are different concerns, so we analyze them separately.

Driver. In Figure 1, we present the results for the DOSC metric. The only idiom that presents driver scattering is our new first increment. This occurs due to the annotations we must add to the around advice defined within the feature code, as explained in Section III-A. This may hinder code reusability and changeability. However, our first increment reduces the bytecode instrumentation, as we discuss in Section IV-B. Additionally, the Cut feature does not present an around advice, therefore there is no AroundAdvice annotation in its code. The NIO and IO features present only one around advice each, thus there is no need to add the AroundAdvice annotation, as only one pointcut redefinition implements the driver.

Fig. 1: DOSC for driver

Furthermore, Layered Aspects [4] and our new second increment do not present any driver code scattering, since their drivers are implemented within a unique separate component for each idiom. In particular, our AroundClosure idiom, which is the best solution proposed, does not present driver code scattering either, for the same reasons.

Feature. Figure 2 illustrates the DOSC results. In this context, our new second increment presents a disadvantage compared to the others. This happens because the @AspectJ syntax, which is used by the second increment, does not support intertype declarations. Thus, as explained in Section III-B, this idiom needs an additional AspectJ aspect (traditional syntax) to implement the intertype declarations, which contributes to scattering feature code across at least two components. On the other hand, the NIO and IO features do not present intertype declarations within their implementation. Thus, our second increment does not scatter feature code in these cases. Differently from driver code scattering, the implementations of our first increment and Layered Aspects are similar regarding feature code. Thus, these two idioms present similar results. Additionally, the AroundClosure idiom only presents feature code scattering when more than one aspect is used to implement feature code, which is the case of the Delete feature.

Fig. 2: DOSC for feature

At last, we answer Question 2: our first increment increases driver code scattering, whereas our second increment increases feature code scattering. However, our final solution (AroundClosure) presents no driver scattering and lower scattering rates for feature code.

3) Tangling: This section answers Question 3 by investigating the extent of tangling between feature and driver code. According to the principle of separation of concerns [26], one should be able to implement and reason about each concern independently. As in our previous work [4], we also assume that the greater the tangling between feature code and its driver code, the worse the separation of those concerns. In this way, we measure the Degree of Tangling within Components (DOTC). In Figure 3, we show the DOTC metric results. Only our first increment presents tangling between the two concerns, driver and feature. This happens due to the AroundAdvice annotation included within the aspects that implement feature code. On the other hand, our second increment and AroundClosure present no tangling between driver and feature code.
For example, Listings 7 and 8 contain only driver code, following the prune dependency rule; that is, the code defined within the TotalDynamic class and aspect is relevant only to the driver concern. In this way, these idioms comply with the results obtained for Layered Aspects. At last, we conclude that our first increment increases the tangling between driver and feature code, whereas our second increment and AroundClosure present no tangling at all. This answers Question 3.

Fig. 3: DOTC

4) Size: To identify whether any idiom increases the size of its implementation, we try to answer Question 4, which is related to the size of each idiom in terms of lines of code and number of components. For this purpose, we use the SLOC and VS metrics. In this context, the differences between the four idioms are insignificant for the SLOC and VS metrics. For instance, the Icons feature presents between 2,155 and 2,186 source lines of code for the smallest and largest idiom implementation, respectively. This represents a difference of only 1.41% of the feature implementation. Similarly, the differences between the four idioms for the VS metric results are also insignificant. Therefore, we answer Question 4 stating that our new idioms do not increase lines of code and number of components. We provide all data and graphs elsewhere [10].

B. Qualitative discussion

In this section, we qualitatively discuss Layered Aspects and our three new idioms in terms of code reusability, changeability, and instrumentation overhead. Several works choose similar approaches to qualitatively discuss their implementations [27], [28], [29], [30].

1) Reusability: Reusability concerns how easily we can reuse the flexible binding implementation of an idiom. Therefore, we are interested in checking what we need to do to reuse a given idiom's code when applying it to another feature.

Layered Aspects and first increment. We may have to perform several changes to reuse the code of the implementation of these idioms. Only if the features to which we aim to apply flexible binding do not present any around advice within their implementation would we perform few changes to reuse the code of these idioms between features, since the adviceexecution pointcut is reused as it applies to all before and after advice. However, Layered Aspects and our first increment redefine the pointcuts related to around advice, which hinders reuse, since these redefined pointcuts are associated with a particular feature. Hence, this compromises the overall reusability of the implementation of both idioms.

Second increment and AroundClosure. Few changes are needed to reuse the code of both idioms. The adviceexecution pointcut matches all the pieces of advice within the feature implementation, no matter whether they are before, after, or around advice. Thus, our second increment and AroundClosure are easily reused, since the difference between one dynamic feature binding and another is only the aspect to which the adviceexecution pointcut should apply (the within clause in Listings 7 and 8) and the input to the driver. For example, if we want to apply the AroundClosure idiom to the Cut feature, we could reuse the code of this idiom used for the Total feature. In Listing 8, we would alter TotalFeature to CutFeature in Line 2, which corresponds to the aspect that contains the Cut feature code, and "total" to "cut" in Line 4, which represents the Cut feature property in the properties file used for the driver in our case.
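Following this description, the reused aspect for the Cut feature would differ from Listing 8 only in those two points. The sketch below simply applies the two changes described above (CutFeature is the aspect assumed to hold the Cut feature code):

// Reuse of the AroundClosure idiom for the Cut feature: compared with
// Listing 8, only the within() target and the driver key change.
aspect CutDynamic {
    Object around(): adviceexecution() && within(CutFeature) {
        if (Driver.isActivated("cut")) {
            return proceed();
        } else {
            return Util.proceedAroundCall(thisJoinPoint);
        }
    }
}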
2) Changeability: Changeability is related to the amount of change we need to perform in the application or in the idiom to implement flexible feature binding. Hence, we are interested in how difficult or time-consuming the task of applying a flexible feature binding implementation through an idiom is.

Layered Aspects and first increment. Applying these idioms demands several changes to implement flexible binding for a feature. For Layered Aspects, all pointcuts related to an around advice defined within the feature implementation are redefined in the aspect that implements dynamic feature binding. Hence, if the 101Companies SPL is being modified to support flexible binding, we need to change the aspect containing feature code (TotalFeature) to support pointcut redefinition, and we would need to redefine each pointcut related to an around advice in order to associate it with driver code. Similarly, our first increment also demands these pointcut redefinitions, and we additionally need to introduce the annotations in the around advice, as explained in Section III. This could require a lot of changes.

Second increment and AroundClosure. Applying these idioms demands few changes to implement flexible binding for a feature. As explained in Section III, the second increment and AroundClosure do not redefine pointcuts. Hence, neither major changes nor alterations to feature code are needed.

3) Instrumentation overhead (CIO): Now, we are interested in avoiding pointcuts that unnecessarily match join points. If we can exclude all the unnecessary instrumentation, we may gain in runtime due to the smaller amount of instrumentation generated by the AspectJ compiler.

Layered Aspects. Implementing flexible feature binding with this idiom may lead to instrumentation overhead, because its adviceexecution pointcut matches more join points than necessary. The code of this idiom instruments all the pieces of advice within the feature implementation, even though the pieces of around advice are already handled by the redefined pointcuts. This may lead to runtime overhead as well.

First increment. Our first increment annotates the around advice and uses the !@annotation(AroundAdvice) clause in the adviceexecution pointcut to avoid instrumentation overhead. In this way, the adviceexecution pointcut only matches before and after advice, which eliminates the unnecessary instrumentation caused by the use of Layered Aspects.

Second increment and AroundClosure. These idioms do not present instrumentation overhead because their adviceexecution pointcut matches each piece of advice within the feature implementation only once. Hence, there is no unnecessary instrumentation.

At last, we also provide all the source code used in this work and useful information for researchers to replicate our work [10].

V. THREATS TO VALIDITY

In this section, we discuss some threats to the validity of our work. We divide them into threats to internal and external validity.

A. Threats to internal validity

Threats to internal validity are concerned with whether it is indeed the assessment that leads to the results [31]. Therefore, in our work, these threats encompass the introduction of bias due to the selection of certain procedures, such as the way feature code is assigned. Additionally, we discuss decisions that might introduce errors in our work and how we tried to circumvent them.

BerkeleyDB refactoring. Our BerkeleyDB case study was originally refactored by Kästner et al. [15]. The code of its features was extracted into aspects.
However, this extraction was not in accordance with the way we extracted the implementation of the features of the other case studies. Therefore, we refactored the code of the BerkeleyDB product line's features so as to comply with the other feature implementations. Indeed, we followed the same procedures, such as the prune dependency rules, in order to refactor these implementations.

Feature code identification. We cannot assure that the extraction of our selected features is free of bias, because the task of identifying feature code is somewhat subjective. This could be a hindrance to researchers who might try to replicate our work. Indeed, there could be disagreements between feature code identified by different researchers [32]. However, we tried to minimize such unreliability in two ways. First, we used the prune dependency rules [19] to identify feature code. These rules define procedures that the researcher should follow to avoid introducing bias in the resulting extracted feature code, as we mentioned in Section IV. Second, only one researcher identified the implementation of the selected features. We believe that restricting the number of people decreased unreliability.

B. Threats to external validity

Threats to external validity are concerned with the generalization of the results [31].

Selected software product line limitations. To perform our assessment, we selected four SPLs. They are written in Java and their feature code is implemented using aspect-oriented programming. Therefore, we cannot generalize the results presented here to other contexts, such as different programming paradigms or languages. Nevertheless, Java and AspectJ can be used together in SPLs, which reinforces the significance of our new idioms. So the increments presented could be applied to other SPLs that comply with the technologies we considered.

Multiple drivers absence. In this work, we only consider applying one driver at a time. However, we realize that some applications may depend on several conditions to activate or deactivate a certain feature. For instance, Lee et al. use a home service robot product line as a case study [33]. This robot changes its configuration dynamically depending on the environment brightness or its remaining battery. It would demand at least two drivers to (de)activate some of its features in our context. Furthermore, the driver-related boolean expression could become complex and hard to maintain, since simple boolean operations such as AND or OR may not suffice. Therefore, we reinforce that the mechanism that provides information to the driver is out of the scope of this work. Our proposal is to abstract the way our idioms receive this information. Even considering a complex boolean expression, its evaluation can only be true or false, and this is all our driver implementation needs to know. Nevertheless, we plan to study these scenarios in future work.

AspectJ compiler dependence. As explained in Section III, our AroundClosure idiom depends on an internal resource of AspectJ's compiler. Thereby, this idiom may not work in scenarios where a different compiler is used. However, besides the AspectJ compiler, which is popular, other well-known compilers, such as the ones used for CaesarJ [13] and ABC [14], also include the resource needed to apply the AroundClosure idiom. Thus, we believe our idiom covers at least three popular compilers.
VI. RELATED WORK

Besides Layered Aspects, we point out other research regarding flexible binding times, as well as studies that relate aspects and product line features. Additionally, we discuss how our work differs from them.

Rosenmüller et al. propose an approach for statically generating tailor-made SPLs that support dynamic feature binding time [34]. Similarly to part of our work, they statically choose a set of features to compose a product that supports dynamic binding time. Furthermore, the authors describe a feature-based approach of adaptation and self-configuration to ensure composition safety. In this way, they statically select the features required for dynamic binding and generate a set of binding units that are composed at runtime to yield the program. Additionally, they implement their approach in one case study and evaluate it with respect to reconfiguration performance at runtime. Their contribution is restricted to applications based on C++, since they use the FeatureC++ language extension [35]. In contrast, our contribution is restricted to applications written mostly in Java, since we use AspectJ to provide flexible feature binding. In this way, our contribution applies to a different set of applications.

Lee et al. propose a systematic approach to develop dynamically reconfigurable core assets, which lies in the management of dynamic binding time regarding changes during product execution [33]. Furthermore, they present strategies to manage product composition at runtime. Thus, they are able to safely change the product composition (activate or deactivate features) due to an event that occurs during runtime. However, the authors only provide conceptual support for a reconfiguration tool, with no actual implementation.

Trinidad et al. propose a process to generate a component architecture that is able to dynamically activate or deactivate features and to perform some analysis operations on feature models to ensure that the feature composition is valid [36]. They apply their approach to generate an industrial real-time television SPL. However, they do not consider crosscutting features, which are very common in our experience. In contrast, our approach works with crosscutting features.

Dinkelaker et al. [37] propose an approach that uses a dynamic feature model to describe variability and a domain-specific language for declaratively implementing variations and their constraints. Their work has mechanisms to detect and resolve feature interactions dynamically by validating an aspect-oriented model at runtime.

Marot et al. [38] propose OARTA, a declarative extension to the AspectBench Compiler [14] that allows dynamic weaving of aspects. OARTA extends the AspectJ language syntax so that a developer can name an advice, which allows referring to it later on. It is possible for aspects to weave on other aspects. Therefore, they exemplify how to dynamically deactivate features in runtime situations (e.g., features competing for resources, which may be deactivated to speed up the execution). Using AspectJ, we would have to add an if() pointcut predicate to the pointcut of the advice that contains feature code. This may lead to a high degree of driver code scattering. As shown in Section IV, our AroundClosure idiom does not present such an issue.

Rosenmüller et al. present an approach that supports static and dynamic composition of features from a single base code [39]. They provide an infrastructure for dynamic instantiation and validation of SPLs.
Their approach is based on FOP [40], whereas our work uses AOP. Additionally, they use an extension called FeatureC++ [35] to automate dynamic instantiation of SPLs. However, the usage of C++ as a client language may introduce some specific problems, such as static constructs when using dynamic composition, virtual classes, and semantic differences when comparing static and dynamic compositions [39]. Since our work uses only Java as a client language, we did not observe these problems in our implementations.

An alternative proposal considers conditional compilation as a technique to implement flexible feature binding [41]. This work discusses how to apply conditional compilation in real applications such as operating systems. Similarly to what we describe in our work, developers need to decide which features should be included to compose the product and their respective binding times. However, the work concludes that conditional compilation is, in fact, not a very elegant solution to provide flexible feature binding, and for complex variation points the situation becomes even worse.

Another proposal to implement flexible feature binding, which is also our previous work, considers aspect inheritance [6]. It defines an idiom that relies on aspect inheritance through an abstract pointcut definition. This solution states that we have to create an abstract aspect with the feature code and an abstract pointcut definition, and then associate this driver with the advice. Furthermore, we create two concrete subaspects inheriting from the abstract one in order to implement the concrete driver. Despite the fact that this solution mitigates some weaknesses found in the literature, it is worse than the idioms presented in our previous work [4].

Chakravarthy et al. present Edicts [5], an AspectJ-based idiom to implement flexible feature binding. The idea is to scatter feature code across one abstract aspect and two concrete subaspects. Then, the programmer implements the driver by adding if statements within the pieces of advice. However, our previous work [4] identified issues regarding code cloning, scattering, tangling, and size when applying Edicts to provide flexible feature binding. We reduce these issues with Layered Aspects and, moreover, we fix the Layered Aspects issues with the AroundClosure idiom proposed in this work.

VII. CONCLUSION

In this work, we identify deficiencies in an existing AspectJ-based idiom used to implement flexible feature binding in the context of software product lines. To improve this idiom, we incrementally define a new idiom called AroundClosure. The creation of AroundClosure is performed increment by increment, which means that every increment corresponds to an improved idiom. To evaluate our new idioms, we perform an assessment regarding code cloning, scattering, tangling, and size. Furthermore, we qualitatively discuss these idioms with respect to code reusability, changeability, and instrumentation overhead. To reach our conclusions, we base our analysis on 13 features of four different product lines and on the knowledge acquired during our research and previous work.

VIII. ACKNOWLEDGMENTS

We would like to thank our colleagues of the Software Productivity Group (SPG) for several discussions that helped to improve the work reported here. Besides that, we would like to acknowledge financial support from CNPq, FACEPE, CAPES, and the Brazilian Software Engineering National Institute of Science and Technology (INES).

REFERENCES
[1] K. Pohl, G. Bockle, and F. J. van der Linden, Software Product Line Engineering. Berlin: Springer-Verlag, 2005.
[2] M. Rosenmüller, N. Siegmund, S. Apel, and G. Saake, "Flexible feature binding in software product lines," Automated Software Engineering, vol. 18, no. 2, pp. 163–197, 2011.
[3] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin, "Aspect-oriented programming," in Proceedings of the European Conference on Object-Oriented Programming. Jyväskylä, Finland: Springer, Berlin, 9-13 June 1997, pp. 220–242.
[4] R. Andrade, M. Ribeiro, V. Gasiunas, L. Satabin, H. Rebêlo, and P. Borba, "Assessing idioms for implementing features with flexible binding times," in Proceedings of the European Conference on Software Maintenance and Reengineering. Oldenburg, Germany: IEEE Computer Society, Washington, 1-4 March 2011, pp. 231–240.
[5] V. Chakravarthy, J. Regehr, and E. Eide, "Edicts: Implementing features with flexible binding times," in Proceedings of the International Conference on Aspect-Oriented Software Development. Brussels, Belgium: ACM, New York, 1-4 April 2008, pp. 108–119.
[6] M. Ribeiro, R. Cardoso, P. Borba, R. Bonifácio, and H. Rebêlo, "Does AspectJ provide modularity when implementing features with flexible binding times?" in Latin American Workshop on Aspect-Oriented Software Development, Fortaleza, Brazil, 4-5 October 2009, pp. 1–6.
[7] G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. G. Griswold, "Getting started with AspectJ," Communications of the ACM, vol. 44, no. 10, pp. 59–65, 2001.
[8] R. Laddad, AspectJ in Action: Enterprise AOP with Spring Applications. Manning Publications, 2009.
[9] M. Eaddy, "An empirical assessment of the crosscutting concern problem," Ph.D. dissertation, Graduate School of Arts and Sciences, Columbia University, New York, 2008.
[10] R. Andrade, M. Ribeiro, H. Rebêlo, V. Gasiunas, L. Satabin, and P. Borba, "Flexible binding time," 3 August 2013. [Online]. Available: http://preview.tinyurl.com/mkcaf5k
[11] J.-M. Favre, R. Lämmel, T. Schmorleiz, and A. Varanovich, "101companies: A community project on software technologies and software languages," vol. 7304, pp. 58–74, 2012.
[12] E. Hilsdale and J. Hugunin, "Advice weaving in AspectJ," in Proceedings of the International Conference on Aspect-Oriented Software Development. Lancaster, UK: ACM, New York, 22-26 March 2004, pp. 26–35.
[13] I. Aracic, V. Gasiunas, M. Mezini, and K. Ostermann, "An overview of CaesarJ," Transactions on Aspect-Oriented Software Development I, vol. 3880, pp. 135–173, 2006.
[14] P. Avgustinov, A. S. Christensen, L. Hendren, S. Kuzins, J. Lhoták, O. Lhoták, O. de Moor, D. Sereni, G. Sittampalam, and J. Tibble, "ABC: An extensible AspectJ compiler," in Proceedings of the International Conference on Aspect-Oriented Software Development. Chicago, USA: ACM, New York, 14-18 March 2005, pp. 87–98.
[15] C. Kästner, S. Apel, and D. Batory, "A case study implementing features using AspectJ," in Proceedings of the 11th International Software Product Line Conference. Kyoto, Japan: IEEE Computer Society, Washington, 10-14 September 2007, pp. 223–232.
[16] Tigris, "ArgoUML," 3 August 2013. [Online]. Available: http://argouml.tigris.org/
[17] J. Müller, D. Polansky, P. Novak, C. Foltin, and D. Polivaev, "Free mind mapping software," 3 August 2013. [Online]. Available: http://preview.tinyurl.com/5qrd5
[18] K.-C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, "Feature-oriented domain analysis (FODA) feasibility study," Tech. Rep. CMU/SEI-90-TR-21, 1990.
[19] M. Eaddy, A. Aho, and G. C. Murphy, "Identifying, assigning, and quantifying crosscutting concerns," in Proceedings of the International Workshop on Assessment of Contemporary Modularization Techniques. Minneapolis, USA: IEEE Computer Society, Washington, 20-26 May 2007, pp. 2–7.
[20] V. Basili, G. Caldiera, and D. H. Rombach, "The goal question metric approach," in Encyclopedia of Software Engineering. Wiley, New Jersey, 1994, pp. 528–532.
[21] I. Baxter, A. Yahin, L. Moura, M. Sant'Anna, and L. Bier, "Clone detection using abstract syntax trees," in Proceedings of the International Conference on Software Maintenance. Bethesda, USA: IEEE Computer Society, Washington, 16-20 November 1998, pp. 368–377.
[22] T. Kamiya, F. Ohata, K. Kondou, S. Kusumoto, and K. Inoue, "CCFinder official site," 3 August 2013. [Online]. Available: http://www.ccfinder.net/
[23] D. R. School and D. C. Rajapakse, "An investigation of cloning in web applications," in Proceedings of the Special Interest Tracks and Posters of the International Conference on World Wide Web. Sydney, Australia: Springer-Verlag, Berlin, 27-29 July 2005, pp. 252–262.
[24] C. J. Kapser and M. W. Godfrey, "Supporting the analysis of clones in software systems: A case study," Journal of Software Maintenance and Evolution: Research and Practice, vol. 18, pp. 61–82, 2006.
[25] M. Bruntink, A. van Deursen, R. van Engelen, and T. Tourwe, "On the use of clone detection for identifying crosscutting concern code," IEEE Transactions on Software Engineering, vol. 31, pp. 804–818, 2005.
[26] D. L. Parnas, "On the criteria to be used in decomposing systems into modules," Communications of the ACM, vol. 15, no. 12, pp. 1053–1058, 1972.
[27] H. Rebêlo, R. Lima, and G. T. Leavens, "Modular contracts with procedures, annotations, pointcuts and advice," in Proceedings of the Brazilian Symposium on Programming Languages, São Paulo, Brazil, 29-30 September 2011.
[28] G. Kiczales and M. Mezini, "Aspect-oriented programming and modular reasoning," in Proceedings of the International Conference on Software Engineering. St. Louis, USA: ACM, New York, 15-21 May 2005, pp. 49–58.
[29] C. A. Cunha, J. L. Sobral, and M. P. Monteiro, "Reusable aspect-oriented implementations of concurrency patterns and mechanisms," in Proceedings of the International Conference on Aspect-Oriented Software Development, 2006, pp. 134–145.
[30] J. Hannemann and G. Kiczales, "Design pattern implementation in Java and AspectJ," in Proceedings of the Conference on Object-Oriented Programming, Systems, Languages, and Applications. Seattle, USA: ACM, New York, 4-8 November 2002, pp. 161–173.
[31] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering: An Introduction. Boston: Kluwer Academic, 2000.
[32] A. Lai and G. C. Murphy, "The structure of features in Java code: an exploratory investigation," in Workshop on Multi-Dimensional Separation of Concerns. Denver, USA: ACM, New York, 1-5 November 1999.
[33] J. Lee and K. C. Kang, "A feature-oriented approach to developing dynamically reconfigurable products in product line engineering," in Proceedings of the International Software Product Line Conference. Baltimore, USA: IEEE Computer Society, Washington, 21-24 August 2006, pp. 131–140.
[34] M. Rosenmüller, N. Siegmund, M. Pukall, and S. Apel, "Tailoring dynamic software product lines," in Proceedings of the International Conference on Generative Programming and Component Engineering. Portland, USA: ACM, New York, 22-23 October 2011, pp. 3–12.
[35] S. Apel, T. Leich, M. Rosenmüller, and G. Saake, "FeatureC++: On the symbiosis of feature-oriented and aspect-oriented programming," in Proceedings of the International Conference on Generative Programming and Component Engineering. Tallinn, Estonia: Springer-Verlag, Berlin, 29 September-1 October 2005, pp. 125–140.
[36] P. Trinidad, A. R. Cortés, J. Peña, and D. Benavides, "Mapping feature models onto component models to build dynamic software product lines," in Proceedings of the International Software Product Line Conference. Kyoto, Japan: Kindai Kagaku, Tokyo, 10-14 September 2007, pp. 51–56.
[37] T. Dinkelaker, R. Mitschke, K. Fetzer, and M. Mezini, "A dynamic software product line approach using aspect models at runtime," in Proceedings of the Workshop on Composition and Variability, Rennes and Saint-Malo, France, 15-19 March 2010.
[38] A. Marot and R. Wuyts, "Composing aspects with aspects," in Proceedings of the International Conference on Aspect-Oriented Software Development. Rennes and Saint-Malo, France: ACM, New York, 15-19 March 2010, pp. 157–168.
[39] M. Rosenmüller, N. Siegmund, G. Saake, and S. Apel, "Code generation to support static and dynamic composition of software product lines," in Proceedings of the International Conference on Generative Programming and Component Engineering. Nashville, TN, USA: ACM, New York, 19-23 October 2008, pp. 3–12.
[40] C. Prehofer, "Feature-oriented programming: A fresh look at objects," in Proceedings of the European Conference on Object-Oriented Programming. Jyväskylä, Finland: Springer, Berlin, 9-13 June 1997, pp. 419–443.
[41] E. Dolstra, G. Florijn, and E. Visser, "Timeline variability: The variability of binding time of variation points," Tech. Rep. UU-CS-2003-052, 2003.

Towards Validating Complexity-based Metrics for Software Product Line Architectures

Anderson Marcolino, Edson Oliveira Junior, Itana Gimenes
Informatics Department, State University of Maringá, Maringá-PR, Brazil
Email: [email protected], {edson, itana}@din.uem.br

Tayana U. Conte
Institute of Computing, Federal University of Amazonas, Manaus-AM, Brazil
Email: [email protected]

Abstract—Software product line (PL) is an approach that focuses on software reuse and has been successfully applied to specific domains. The PL architecture (PLA) is one of its most important assets, and it represents the commonalities and variabilities of a PL. The analysis of the PLA, supported by metrics, can be used as an important indicator of PL quality and return on investment (ROI). This paper presents the replication of a controlled experiment for validating complexity metrics for PLAs. In particular, in this replication we focus on evaluating how subjects less qualified than those of the original experiment rate the complexity of a PLA by means of generated specific products. A PLA variability resolution model of a given PL was applied by a sample of subjects with at least basic knowledge of UML modeling, PL, and variability management. Apart from the selection of different subjects, the original experiment conditions were kept. The proposed PLA complexity metrics were experimentally validated based on their application to a set of 35 products derived from the Arcade Game Maker (AGM) PL. Normality tests were applied to the metrics' observed values, indicating their non-normality.
Therefore, the non-parametric Spearman's rank correlation technique was used to demonstrate the correlation between the CompPLA metric and the complexity rate given by the subjects to each derived product. Such a correlation was strong and positive. The results obtained in this replication show that even less-qualified subjects, compared to the subjects of the original experiment, are able to rate the complexity of a PLA by means of its generated products, thus corroborating the results of the original experiment and providing more evidence that the composed metric for complexity (CompPLA) can be used as a relevant indicator for measuring the complexity of a PLA based on its derived products.

Keywords—Correlation Analysis, Empirical Validation, Metrics, Replication, Software Product Line Architecture.

I. INTRODUCTION

In recent years, software product line (PL) engineering [11] has emerged as a promising reusability approach, which brings important benefits, such as increasing the reuse of core assets while decreasing the time to market. One of the most important assets of a PL is its architecture (PLA). The PLA¹ plays a central role in the development of products from a PL, as it is the abstraction of the products that can be generated, and it represents the commonalities and variabilities of a PL [20].

¹ In this work, we only take into account the PL logical architecture.

The evaluation of a PLA must be supported by a set of metrics [6]. Such metrics must both evidence the quality of a PL and serve as a basis to analyze the managerial and economical value of a PL. The PLA must make explicit the common (similarities) and variable (variabilities) aspects of a PL. The variability impact analysis on the PL development can determine the quality and the aggregated value of a PL for an organization. Metrics for a PLA are applied to a set of assets from which variants can be generated, rather than to one specific product. Thus, it is necessary to define specific PLA metrics to provide effective quality indicators with regard to the overall PL development and evolution. Therefore, metrics for PLA complexity were proposed in previous works [14], [15]. These metrics were defined to provide an indicator of how complex a PLA is by measuring its derived PL products. Complexity is measured based on McCabe's Cyclomatic Complexity (CC) [12]. PLA evaluation is essential for providing an indicator of PL quality, taking into account the behavior of the generated products. As the PLA is one of the most important assets of a PL, by taking into consideration two or more quality attributes measured from a given PLA, one can perform trade-off analysis to identify which quality attribute(s) must be prioritized for the evolution and development of a PL [16]. An initial experimental validation of the complexity metrics, the original experiment, was done in [15], providing initial evidence that such metrics can be used as complexity indicators for PLAs. Thus, this paper aims to provide more evidence that the proposed PLA complexity metrics can be used as a relevant indicator of PLA complexity by carrying out an internal replication of the original experiment. The replication is justified in [13], [19], and must be conducted to ensure more robust evidence sets that can support generally applicable conclusions.
The remainder of this paper is organized as follows: Section II presents the description of the complexity metrics; Section III presents the original experiment previously carried out and its results; Section IV discusses how the replication was planned and carried out; Section V analyzes the differences between the two experiments (original and replication); and Section VI presents the conclusions and directions for future work.

II. COMPLEXITY METRICS FOR SOFTWARE PRODUCT LINE ARCHITECTURES

A Software Product Line (PL) represents a set of systems sharing common features that satisfy the needs of a particular market or mission segment [11], [17]. This set of systems is also called a product family. The family's members are specific products developed in a systematic way from the PL core assets. The core assets have a set of common features as well as a set of variable parts, which represent later design decisions [17]. The composition and configuration of such assets yield specific products.

PL adoption has been increasing in recent years. Success stories from several companies have been reported in the literature, including Philips, Bosch, Nokia, and Toshiba [11]. The benefits obtained with the PL approach include [11]: better understanding of the domains, more artifact reuse, and less time to market. The adoption of a PL approach requires long-term management, since the benefits are not immediately obtained. A wide variety of products should be produced to consolidate the PL adoption and, therefore, provide a return on investment (ROI) [3].

In order to produce PL products and take advantage of PL benefits, there must be intensive PL architecture design and evaluation, as well as efficient variability management. The PL architecture (PLA) plays a central role in successfully generating specific products, taking into account the development and evolution of a PL. It abstractly represents the architecture of all potential PL products from a specific domain. The PLA addresses the PL design decisions by means of their similarities, as well as their variabilities [20]. Organizations should continuously evaluate the quality of their products by managing their PL evolution and variabilities. Thus, the PLA evaluation should be considered one of the most important activities throughout a PL life cycle [11].

Architecture evaluation is an important activity of software design. Informal evaluations, based on use case scenarios, for instance, are widely performed. However, most of the time they do not generate accurate results [11]. Although there are more rigorous and consolidated evaluation methods in the literature, such as the Architecture Tradeoff Analysis Method (ATAM) and the Software Architecture Analysis Method (SAAM) [5], the evaluation of a PLA [11], [17] requires particular attention due to variability issues. Such an evaluation should take into account issues such as: the relevant quality attributes of a PLA; the time when the PLA is evaluated; and the techniques and metrics used to evaluate the PLA [7]. The evaluation of a quality attribute-based PLA can be used as a parameter for evaluating a PL in general [7]. By trading off the PLA quality attributes, it is possible for PL managers and architects to prioritize which quality attributes must be taken into consideration during PLA evolution. This occurs because the PLA quality attributes take into account variabilities, which can be used as a parameter for the quality evaluation of an overall PL.
The evaluation of a PLA also requires a set of quality attribute metrics that can provide evidence of the PL quality, thus serving as the basis to analyze the managerial and economical values of a PL [3]. Hoek et al. [8] present a set of metrics to evaluate a PLA with regard to its structural soundness quality attribute. Rahman [18] proposes a set of metrics to evaluate the component structure of a PLA based on quality attributes. Both works propose metrics to support the evaluation of a PLA. However, they do not take into consideration the PLA variabilities represented in PLA models to support quantitative analysis and improve qualitative analysis. This means that their work does not allow the analysis of PLA behavior based on variabilities, which is interesting for analyzing the PLA return on investment.

The proposed complexity metrics for PLAs [14]–[16] were composed based on the Cyclomatic Complexity (CC) [12] and Weighted Methods per Class (WMC) [4] metrics. The CC metric measures the quantity of decision logic, represented by the number of paths to be tested in a source code. The WMC metric is the sum of the CC metric for each concrete method in an object-oriented class. Each metric measures the complexity of a class, interface, or component, and they were used in the original experiment [15]. The complexity metrics taken into consideration in this paper are as follows [15]:

CompClass: measures the complexity of a class. It is the value of the WMC metric for a class. This metric is represented by the following formula:

CompClass(Cls) = \sum_{i=1}^{n} WMC(Mtd_i),   (1)

where n is the number of concrete methods (Mtd) of a class (Cls). Interfaces always have a CompClass value of 0.0, as they do not provide concrete methods.

CompPLA: measures the complexity of a PLA. It is the sum of the CompClass metric (Equation 1) for each class of the PLA. This metric is represented by the following formula:

CompPLA(PLA) = \sum_{i=1}^{nCls} CompClass(Cls_i),   (2)

where nCls is the number of classes with associated variability and Cls_i is the i-th class of the PLA.
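As a purely illustrative example (the numbers below are ours and do not come from the AGM PL): consider a class with three concrete methods whose cyclomatic complexities are 2, 3, and 1, and a PLA in which only two classes have associated variability, with CompClass values 6 and 4. The metrics then evaluate to

\[
\mathrm{CompClass}(Cls) = 2 + 3 + 1 = 6, \qquad \mathrm{CompPLA}(PLA) = 6 + 4 = 10.
\]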
III. THE ORIGINAL EXPERIMENT AND ITS RESULTS

This section presents the original experiment planned and carried out in [15].

A. Definition
Based on the Goal-Question-Metric (GQM) template [1], the goal of the original experiment was: analyze collected metrics from UML models and source code, for the purpose of validating, with respect to the capability to be used as PLA complexity indicators, from the point of view of software product line architects, in the context of the generation of specific products from the Software Engineering Institute's Arcade Game Maker (AGM) PLA by graduate students of the University of Waterloo (UWaterloo), the University of São Paulo (ICMC-USP), and the State University of Maringá (UEM).

B. Planning
1) Context Selection: the experiment was carried out in an academic environment.
2) Selection of Subjects: a group of Software Engineering graduate students from ICMC-USP, UEM, and UWaterloo. They have experience in the design of product lines and variabilities using UML.
3) Variable Selection: the independent variables were the AGM2 PL and its class model. AGM was created by the Software Engineering Institute (SEI) to support learning and experimenting based on PL concepts. It has a complete set of documents and UML models, as well as a set of tested classes and source code for three different games: Pong, Bowling, and Brickles. The dependent variables were the complexity of each product generated from the AGM PLA.
4) Instrumentation: the objects were: a document describing the AGM PL; the AGM UML class model; and the resolution model containing the variabilities to be resolved at class level.
5) Hypothesis Formulation: the following hypotheses were tested in this study3:
• Null Hypothesis (H0): There is no significant correlation between the PLA complexity metric (CompPLA) and the subjects' complexity rating for each PLA configuration; and
• Alternative Hypothesis (H1): There is a significant correlation between the PLA complexity metric (CompPLA) and the subjects' complexity rating for each PLA configuration.
6) Experiment Design: all the tasks had to be solved by each of the subjects.

C. Operation
1) Preparation: when the original experiment was carried out, all of the subjects had graduated in the Software Engineering area, in which they had learned how to design, with UML, at least object-oriented (OO) class diagrams. In addition, all of the subjects had experience in applying PL and variability concepts to OO systems designed using UML. The material prepared for the subjects consisted of:
• the AGM PL description;
• the AGM PL core asset class diagram;
• a variability resolution model, in which the subjects had to resolve the variabilities to generate one AGM configuration; and
• a test (questionnaire) describing complexity concepts, in which the subjects had to rate the complexity associated with each generated AGM configuration based on linguistic labels composed of five values: extremely low, low, neither low nor high, high, and extremely high.
2) Execution: the subjects were given the material described in Preparation (Section III.C-1). Each subject was required to generate one AGM configuration.

2 http://www.sei.cmu.edu/productlines/ppl
3 As the metric CompPLA is a composition of the CompClass metric, only CompPLA is taken into consideration for validation purposes.
This was done by following instructions on how to resolve the AGM variability resolution model and on how to rate, from the subjects' viewpoint, the complexity associated with the generated configurations. All the tasks were performed by each subject, with no time limit to solve them. In addition, the CompPLA value of each configuration was divided by the CompPLA value of the overall AGM PLA, thus resulting in a value ranging from 0.0 to 1.0.
3) Data Validation: the tasks performed by the subjects were collected. The subjects' subjective evaluation is considered reliable, as it was made clear that their complexity ratings were essential and they were asked to be as honest as possible. In addition, the process of answering the complexity rating was explained beforehand, increasing the reliability of the answers given by the subjects.

The collected data were summarized by the experiment team by calculating the metric CompPLA for the thirty AGM configurations generated by the subjects, as well as by verifying the complexity rating of such configurations.
1) Descriptive Statistics: Figure 2 presents the CompPLA observed values plotted in a boxplot. It shows the CompPLA values distribution, with sample size N = 30, median value 0.5895, 25%-75% range (0.505, 0.821), and non-outlier range (0.4, 1).

Figure 2. The Original Experiment Data Boxplot [15].

2) Normality Tests: Shapiro-Wilk and Kolmogorov-Smirnov normality tests were conducted for the CompPLA metric distribution values. The results of both tests show that the distribution is non-normal at a significance level of 95%. Therefore, a non-parametric statistical method must be used to analyze the obtained data.
3) Spearman's Rank Correlation: as the CompPLA distribution is non-normal, the non-parametric Spearman's correlation (ρ) was applied to support data interpretation. This method allows establishing whether there is a correlation between two sets of data. The correlation value lies on a scale ranging from -1.0 to 1.0, as follows (Figure 3): -1.0, perfect negative correlation; between -0.99 and -0.5, strong negative correlation; between -0.49 and -0.01, weak negative correlation; 0, no correlation at all; between 0.01 and 0.49, weak positive correlation; between 0.5 and 0.99, strong positive correlation; and 1.0, perfect positive correlation. Equation 3 presents Spearman's ρ formula:

ρ = 1 − (6 Σ_{i=1}^{n} d_i²) / (n(n² − 1)),   (3)

where n is the sample size (N) and d_i is the difference between the two rankings of the i-th observation.

Figure 3. Spearman's Rank Correlation Scale [15].

The following correlation (Corr.1) was performed: CompPLA versus the subjects' complexity rating, which shows that the subjects' understanding of complexity corroborates the CompPLA metric, establishing how to measure complexity in a PLA. Such a correlation resulted in a strong positive correlation (ρ = 0.93), as calculated in Equation 4:

ρ(Corr.1) = 1 − (6 / (30(30² − 1))) × 293.5 = 1 − 0.07 = 0.93   (4)

This means that there is evidence that the complexity metrics are significantly correlated to the complexity of a PLA and, therefore, the experiment null hypothesis (H0, Section III-B) must be rejected. The results obtained in the original study lead to the understanding that there is evidence that the metric CompPLA is a relevant indicator of PLA complexity, based on its correlation to the subjects' rating for the given AGM PL.
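As a concrete illustration of the analysis pipeline just described (normality check followed by Spearman's rank correlation), the following is a minimal sketch assuming SciPy is available; the sample arrays are placeholders, not the experiment's data (see Table I and Figure 2 for the actual samples).

```python
# Minimal sketch of the analysis steps described above, using SciPy.
# The arrays are illustrative placeholders only.
from statistics import mean, pstdev
from scipy import stats

comp_pla = [0.59, 0.82, 0.51, 0.95, 0.43]   # placeholder normalized CompPLA values
ratings  = [3, 4, 3, 5, 2]                  # placeholder ratings mapped to 1..5
                                            # (extremely low = 1 ... extremely high = 5)

# Normality tests: p < 0.05 -> treat the sample as non-normal.
_, p_shapiro = stats.shapiro(comp_pla)
_, p_ks = stats.kstest(comp_pla, "norm", args=(mean(comp_pla), pstdev(comp_pla)))

# Non-normal samples -> non-parametric Spearman's rank correlation (Equation 3).
rho, p_value = stats.spearmanr(comp_pla, ratings)
print(p_shapiro, p_ks, rho)
```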
D. Validity Evaluation
• Threats to Conclusion Validity: the only issue taken into account as a risk to the statistical validity is the sample size (N = 30), which must be increased in prospective replications of this study in order to effectively reach conclusion validity.
• Threats to Construct Validity: our dependent variable is the complexity metric CompPLA. Subjective measures were proposed for such a metric, in the form of linguistic labels, collected based on the subjects' rating. As the subjects have experience in modeling OO systems using at least class diagrams, their ratings were taken as significant. The construct validity of the metric used as the independent variable is supported by the insights (theoretical validation) of a previous study of metrics for PLA [14].
• Threats to Internal Validity: the following issues were dealt with:
– Fatigue effects. The experiment lasted 69 minutes on average (at least 33 minutes and at most 98 minutes), thus fatigue was considered not relevant. In addition, the variability resolution model contributed to reducing such effects.
– Measuring PLA and Configurations. As a PLA can be analyzed based on its products (configurations), measuring derived configurations provides a means to analyze PLA quality attributes by allowing trade-off analysis to prioritize such attributes. Thus, applying the metrics to PLA configurations to rate the overall PLA complexity was considered valid.
– Other influence factors. Influence among subjects could not really be controlled. The subjects performed the experiment under the supervision of a human observer. Thus, this issue did not affect the study validity.
• Threats to External Validity: two threats were identified:
– Instrumentation. We tried to use a class diagram representative of real cases. However, the PL used in the experiment is non-commercial, and some assumptions can be made on this issue. Thus, more experimental studies taking a real PL from software
organizations must be done.
– Subjects. Seventeen students claimed that they had industrial experience. Obtaining well-qualified subjects was difficult, thus we took into consideration advanced graduate students from the Software Engineering academic community, as we understand the feasibility of using students as subjects [9].

IV. INTERNAL REPLICATION OF THE ORIGINAL EXPERIMENT

According to Juristo and Gómez [10], replication is a key component of experimentation. Experimental results form a body of knowledge as long as they are considerably confirmed. Such a confirmation might be achieved by means of replications and results analysis. Once similar results are obtained, there is evidence that the study is closer to the reality in which it was carried out, allowing one to generalize its conclusions. As the original experiment sample size is small, we needed to replicate such an experiment to gather more evidence for validating the complexity metrics by improving the conclusion generalization.

The gathering of evidence to validate the complexity metrics is based on the FIRE framework, proposed in [13] as part of a joint project, called READERS, between the University of Maryland and the University of São Paulo. FIRE defined systematic phases for replicating studies and for sharing tacit knowledge, by means of the creation of replication packages [19]. By taking the FIRE framework into account, this replication is considered an internal replication, which was planned and carried out by the same research group as the original experiment. External replications are being set up as prospective replications by different universities, such as the Pontifical Catholic University of Rio Grande do Sul and the Pontifical Catholic University of Rio de Janeiro. This replication was rigorous, according to Basili et al. [2], in which one aims at duplicating the original experiment under the same original conditions. This kind of replication increases the accuracy of the original experiment conclusions.

A. Defining the Replication
The goal of this replication was: analyze collected metrics from UML models and source code, for the purpose of validating, with respect to the capability to be used as PLA complexity indicators, from the point of view of software product line architects, in the context of the generation of specific products from the SEI's Arcade Game Maker (AGM) PLA by graduate students.

B. Planning the Replication
• Hypothesis Formulation: the following hypotheses were tested in this replication:
– Null Hypothesis (H0): There is no significant correlation between the PLA complexity metric (CompPLA) and the subjects' complexity rating for each PLA configuration; and
– Alternative Hypothesis (H1): There is a significant correlation between the PLA complexity metric (CompPLA) and the subjects' complexity rating for each PLA configuration.
The replication planning follows exactly the planning of the original experiment (Section III-B) with regard to context selection, variable selection, instrumentation, and experiment design. However, for this replication, the subject selection was made by convenience, selecting a group of Software Engineering graduate students only from the State University of Maringá (UEM). Therefore, the objective in this replication was to identify the capability of less-qualified subjects of rating the complexity of PLAs. The subjects were considered less qualified according to the answers given in the characterization questionnaire. In such a questionnaire, the subjects chose only one of the following alternatives, based on their PL and variability management expertise:
1) I have never heard about PL.
2) I have already read about PL.
3) I have basic experience in PL with regard to the PL development cycle and its activities (domain and application engineering). However, I do not have any experience with variability management.
4) I have moderate experience. I know every concept presented in the previous option. With regard to variability management, I know what variation points and variants are, and I am able to resolve variabilities and specify binding times (design time, link time, runtime).
5) I have advanced experience in PL. I know every concept presented in the previous option. With regard to variability management, I also know resolution models, existing variability management approaches, and how to represent variabilities (by using UML and feature models).
In order to keep the subjects with almost the same PL and variability management knowledge, a training session on PL and variability management was given to the subjects.

C. Conducting the Replication
1) Preparation: the original experiment results showed that CompPLA can be used as an important PLA complexity indicator when expert subjects are taken into account. In this replication, however, we focused on selecting subjects that have moderate experience in designing class models in UML and almost none in designing PLs and variabilities in UML, in order to identify their capability of rating the complexity of PLAs. The subjects' characterization is as follows:
• 8 (eight) subjects never used UML for modeling software;
• 18 (eighteen) subjects have basic experience in UML modeling;
• 9 (nine) subjects have moderate experience in UML modeling;
• 13 (thirteen) subjects never heard about software product lines;
• 11 (eleven) subjects have read about software product lines;
• 8 (eight) subjects have basic experience in software product lines; and
• 3 (three) subjects have moderate experience in software product lines.
The instrumentation was the same as in the original experiment, and the subjects had to assign one complexity label (Section III-C) to each generated configuration.
2) Execution: the subjects were divided into two blocks. The experiment was carried out in two sessions, one with 20 students and one with 15 students, in order to accommodate the availability of the subjects.
3) Data Validation: the tasks performed by the subjects were collected. We consider the subjective ratings of all 35 subjects reliable.

D. Replication Analysis and Interpretation
The CompPLA metric was calculated for each configuration.
1) Descriptive Statistics: Figure 4 presents a boxplot of the replication data, with sample size N = 35 and median value 0.841.
2) Normality Tests: Shapiro-Wilk and Kolmogorov-Smirnov normality tests were applied to the AGM sample. Both normality tests indicated that, for a sample size N = 35, mean value of 0.8006, standard deviation (σ) of 0.1330, and median (x̃) of 0.841, the CompPLA sample is non-normal.
3) Spearman's Rank Correlation: as in the original experiment, the non-parametric Spearman's correlation was applied to the CompPLA sample and to the subjects' complexity rating for each AGM configuration, defining Corr.2. Equation 5 shows the Spearman correlation value calculated for Corr.2, based on the data in Table I:

ρ(Corr.2) = 1 − (6 / (35(35² − 1))) × 4,446 = 1 − 0.63 = 0.37   (5)

Therefore, according to Equation 5, there is a weak positive correlation between the CompPLA value and the subjects' complexity rating, as Corr.2 = 0.37.
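For readers who want to reproduce the hand calculation behind Equations 4 and 5, the following is a minimal sketch of the rank-difference computation; the ranks below are toy values, not the Table I data.

```python
# Minimal sketch of the manual Spearman calculation, assuming the two samples
# have already been converted to ranks (columns ra and rb of Table I).
# The ranks below are placeholders for illustration only.

def spearman_rho(ra, rb):
    """rho = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)),
    where d_i = ra_i - rb_i is the rank difference of the i-th configuration."""
    assert len(ra) == len(rb)
    n = len(ra)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ra, rb))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

ra = [1, 3, 4, 5, 2]   # ranks of CompPLA values (placeholder)
rb = [1, 2, 3, 5, 4]   # ranks of the subjects' complexity ratings (placeholder)
print(spearman_rho(ra, rb))  # 0.7 for this toy data
```

With the actual replication data of Table I, the rank differences yield Σd² = 4,446 for n = 35, which is the value plugged into Equation 5 above.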
Such a correlation value provides evidence that the null hypothesis of this replication study (Section IV-B) must be rejected, whereas the alternative hypothesis must be accepted. Thus, this replication study corroborates, by evidence, the original experiment, by taking the CompPLA metric as a relevant indicator of PLA complexity based on derived products.

Figure 4. The Replication Data Boxplot.

Table I. Replication data for calculating the Spearman's correlation (for each configuration: CompPLA value, its rank ra, the subject's complexity rating, its rank rb, d = ra − rb, and d²):

Config. | CompPLA | ra | Subject's Complexity Rating | rb | d | d²
1  | 1.000 | 1  | Extremely High | 1  | 0   | 0
2  | 0.972 | 3  | Extremely High | 2  | 1   | 1
3  | 0.972 | 4  | Extremely High | 3  | 1   | 1
4  | 0.972 | 5  | Extremely High | 4  | 1   | 1
5  | 0.956 | 11 | Extremely High | 5  | 6   | 36
6  | 0.875 | 12 | Extremely High | 6  | 6   | 36
7  | 0.875 | 18 | Extremely High | 7  | 11  | 121
8  | 0.839 | 19 | Extremely High | 8  | 11  | 121
9  | 0.833 | 20 | Extremely High | 9  | 11  | 121
10 | 0.833 | 24 | Extremely High | 10 | 14  | 196
11 | 0.710 | 25 | Extremely High | 11 | 14  | 196
12 | 0.620 | 29 | Extremely High | 12 | 17  | 289
13 | 0.620 | 30 | Extremely High | 13 | 17  | 289
14 | 0.614 | 31 | Extremely High | 14 | 17  | 289
15 | 0.614 | 32 | Extremely High | 15 | 17  | 289
16 | 0.984 | 2  | High | 16 | -14 | 196
17 | 0.972 | 6  | High | 17 | -11 | 121
18 | 0.926 | 8  | High | 18 | -10 | 100
19 | 0.899 | 9  | High | 19 | -10 | 100
20 | 0.855 | 13 | High | 20 | -7  | 49
21 | 0.849 | 14 | High | 21 | -7  | 49
22 | 0.849 | 15 | High | 22 | -7  | 49
23 | 0.841 | 17 | High | 23 | -6  | 36
24 | 0.755 | 23 | High | 24 | -1  | 1
25 | 0.624 | 28 | High | 25 | 3   | 9
26 | 0.598 | 33 | High | 26 | 7   | 49
27 | 0.598 | 34 | High | 27 | 7   | 49
28 | 0.594 | 35 | High | 28 | 7   | 49
29 | 0.956 | 7  | Neither Low nor High | 29 | -22 | 484
30 | 0.849 | 16 | Neither Low nor High | 30 | -14 | 196
31 | 0.801 | 21 | Neither Low nor High | 31 | -10 | 100
32 | 0.799 | 22 | Neither Low nor High | 32 | -10 | 100
33 | 0.666 | 26 | Neither Low nor High | 33 | -7  | 49
34 | 0.666 | 27 | Neither Low nor High | 34 | -7  | 49
35 | 0.881 | 10 | Low | 35 | -25 | 625

E. Replication Validity Evaluation
1) Conclusion Validity: this risk was minimized, as the original experiment took a sample of 30 subjects and this replication took 35. The results of the replication corroborated the results of the original experiment.
2) Construct Validity: the main difference between the original experiment and this replication was the subjects' characterization. However, this threat was alleviated, as the objective of this replication was to analyze the capability of less-qualified subjects of rating the complexity of PLAs.
3) Internal Validity: the following threats were dealt with in this replication:
• Accuracy of the Subjects' Responses: to minimize this threat, each product generated from the AGM PLA was checked for validity; and
• Fatigue Effects: the experiment took 63 minutes on average. Thus, fatigue was not a critical issue in this replication.
4) External Validity: instrumentation was the main external threat, as the UML class model used in this replication was the same used in the original experiment. New experiments must be planned and conducted applying a real (commercial) PL to allow generalizing the conclusions.

Figure 5. Excerpt of the AGM PLA.
Figure 6. Example of an AGM Brickles Game Configuration.
Figure 7. Example of an AGM Pong Game Configuration.

V. GATHERING EVIDENCE TO VALIDATE THE PLA COMPLEXITY METRICS

The overall experiment (original plus replication) analysis aims at gathering evidence to validate the complexity metrics for PLAs. This validation is based on the correlation between the complexity metrics and the subjects' complexity rating, taking into account the subjects' experience.
Such a correlation is calculated based on the normality of the two sets of values (the CompPLA metric and the subject rating). Figure 5 presents an excerpt of the AGM PLA. The subjects created 35 specific products (configurations) based on such a PLA. Figure 6 and Figure 7 show two possible products created by the subjects from the AGM PLA. For each created configuration, the subjects assigned a complexity rating. The CompPLA metric was then calculated, enabling the correlation between the complexity metric values and the complexity assigned by the subjects.

As the original experiment sample (N = 30) was non-normal, the Spearman's rank correlation was applied and it provided the value 0.93, which means a strong positive correlation. This value was higher than expected, as PL is an emerging topic in software engineering. In the replication presented in this paper, the sample (N = 35) was non-normal, as in the original experiment. Thus, Spearman's correlation was also applied. Such a correlation resulted in the value 0.37, which means a weak positive correlation. Based on the analysis of the replication subjects' characterization, we initially expected a value lower than 0.37. However, it was confirmed that even with subjects less qualified than in the original experiment, we obtained a positive correlation.

We can observe that the sample sizes (original and replication) were almost the same. Such samples were non-normal, and the Spearman's rank correlation was used to provide a value on the Spearman's scale, reflecting the correlation level of CompPLA and the subjects' complexity rating for the original experiment and this replication. The replication provided evidence that CompPLA can be used as a relevant indicator of PLA complexity, which confirms the evidence originally obtained.

The main difference between the original experiment and its replication is the subjects' characterization. While in the original experiment the subjects were more qualified, its replication took into account five more subjects, but less qualified than the original sample. This replication might lead us to infer that even subjects who are not well qualified are able to state how complex a PL product is based on its PLA. This may imply that: (i) the metrics are measuring what they must measure; and (ii) the amount of variability in the resolution model, which was given to the subjects to generate a PL product, does not lead to a misunderstanding of the PLA and, consequently, allows the appropriate generation of a PL product. If (ii) is true, this replication provided initial evidence to investigate whether the variability representation approach used to model variability in the PLA class diagram is effective, contributing to improve the readability of the diagram for each subject.

VI. CONCLUSION

The evaluation of the quality of a PL is an essential issue in the context of the PL approach. One might analyze the quality of a PL by means of its PLA and derived products. This kind of analysis can be supported by metrics, which provide both quantitative and qualitative ways to interpret quality factors of PLs, as shown in Section II. This paper presented the gathering of evidence to validate complexity-based metrics for PLAs by means of: (i) the analysis of the original experiment previously carried out; (ii) the replication of such an original experiment; and (iii) an analysis of the differences between the original and replicated experiments, taking into account different subjects' characterization, to gather evidence to validate such metrics for PLAs.
An original experiment was replicated, keeping the same original conditions. The CompPLA composed metric for PLAs was applied to several AGM PL configurations, and the subjects rated the complexity of such configurations based on their experience. Normality tests were applied to the sample, and the Spearman's rank correlation supported the data analysis of this replication. The results obtained from this replication corroborated previous results from the original experiment, providing evidence that CompPLA can be used as a relevant PLA complexity indicator. Although this replication corroborated the original experiment, it is clear that new replications must be conducted and that the complexity metrics must be applied to a real/commercial PL to allow generalizing the conclusions.

Directions for future work include: (i) planning and conducting one or more external replications keeping the same conditions as this one, but improving the subject selection by taking into consideration more qualified subjects, aiming at reducing the threats to the experiment; (ii) replicating this experiment in industry, with a group of practitioners, taking the FIRE framework into account [13]; (iii) planning and conducting a new experiment for PLA extensibility metrics, allowing one to perform trade-off analysis of PLA quality attributes; and (iv) analyzing the consolidated data provided by all replications to provide a means to generalize the conclusions with regard to the proposed complexity-based metrics for PLA.

REFERENCES

[1] V. R. Basili and H. D. Rombach. The TAME Project: Towards Improvement-Oriented Software Environments. IEEE Transactions on Software Engineering, 14(6):758–773, 1988.
[2] V. R. Basili, F. Shull, and F. Lanubile. Building Knowledge through Families of Experiments. IEEE Transactions on Software Engineering, 25:456–473, 1999.
[3] G. Bockle, P. Clements, J. D. McGregor, D. Muthig, and K. Schmid. Calculating ROI for Software Product Lines. IEEE Software, 21(3):23–31, 2004.
[4] S. R. Chidamber and C. F. Kemerer. A Metrics Suite for Object Oriented Design. IEEE Transactions on Software Engineering, 20(6):476–493, 1994.
[5] P. Clements, R. Kazman, and M. Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.
[6] E. Dincel, N. Medvidovic, and A. van der Hoek. Measuring Product Line Architectures. In International Workshop on Software Product-Family Engineering, pages 346–352, London, UK, 2002. Springer-Verlag.
[7] L. Etxeberria and G. Sagardui. Variability Driven Quality Evaluation in Software Product Lines. In Proceedings of the Software Product Line Conference, pages 243–252, Washington, DC, USA, 2008. IEEE Computer Society.
[8] A. van der Hoek, E. Dincel, and N. Medvidovic. Using Service Utilization Metrics to Assess the Structure of Product Line Architectures. In Proceedings of the International Symposium on Software Metrics, pages 298–308, Washington, DC, USA, 2003. IEEE Computer Society.
[9] M. Höst, B. Regnell, and C. Wohlin. Using Students as Subjects - A Comparative Study of Students and Professionals in Lead-Time Impact Assessment. Empirical Software Engineering, 5(3):201–214, 2000.
[10] N. Juristo and O. S. Gómez. Empirical Software Engineering and Verification. Springer-Verlag, Berlin, Heidelberg, 2012.
[11] F. J. van der Linden, K. Schmid, and E. Rommes. Software Product Lines in Action: The Best Industrial Practice in Product Line Engineering. Springer, Berlin, 2007.
[12] T. J. McCabe. A Complexity Measure. IEEE Transactions on Software Engineering, 2(4):308–320, 1976.
[13] M. Mendonca, J. Maldonado, M. de Oliveira, J. Carver, S. Fabbri, F. Shull, G. H. Travassos, E. Hohn, and V. Basili. A Framework for Software Engineering Experimental Replications. In IEEE International Conference on Engineering of Complex Computer Systems, pages 203–212, 2008.
[14] E. A. Oliveira Junior, I. M. S. Gimenes, and J. C. Maldonado. A Metric Suite to Support Software Product Line Architecture Evaluation. In Conferencia Latinoamericana de Informática, pages 489–498, 2008.
[15] E. A. Oliveira Junior, I. M. S. Gimenes, and J. C. Maldonado. Empirical Validation of Variability-based Complexity Metrics for Software Product Line Architecture. In International Conference on Software Engineering and Knowledge Engineering, pages 622–627. Knowledge Systems Institute Graduate School, 2012.
[16] E. A. Oliveira Junior, I. M. S. Gimenes, J. C. Maldonado, P. C. Masiero, and L. Barroca. Systematic Evaluation of Software Product Line Architectures. Journal of Universal Computer Science, 19(1):25–52, 2013.
[17] K. Pohl, G. Böckle, and F. J. van der Linden. Software Product Line Engineering: Foundations, Principles, and Techniques. Springer-Verlag, Secaucus, NJ, USA, 2005.
[18] A. Rahman. Metrics for the Structural Assessment of Product Line Architecture. Master's thesis, School of Engineering, Blekinge Institute of Technology, Sweden, 2004.
[19] F. Shull, V. Basili, J. Carver, J. Maldonado, G. Travassos, M. Mendonca, and S. Fabbri. Replicating Software Engineering Experiments: Addressing the Tacit Knowledge Problem. In Proceedings of the International Symposium on Empirical Software Engineering, pages 7–16, 2002.
[20] R. N. Taylor, N. Medvidovic, and E. M. Dashofy. Software Architecture: Foundations, Theory, and Practice. John Wiley & Sons, USA, 2009.

A SPL infrastructure for supporting scientific experiments in petroleum reservoir research field

Fernanda Y. S. Foschiani (Instituto de Computação - Unicamp, Campinas, São Paulo - Brasil, Email: [email protected]); Leonardo P. Tizzei (Instituto de Computação - Unicamp, Campinas, São Paulo - Brasil, and IBM Research, Brasil, Email: [email protected]); Cecília M. F. Rubira (Instituto de Computação - Unicamp, Campinas, São Paulo - Brasil, Email: [email protected])

Abstract—Computational resources are commonly used in the research field in order to facilitate data and service sharing. The frequent study of new research methodologies and the diversity of software, simulators, and data involved in experiments lead to the need for environments that provide facilities for combining and using different technologies. Aiming to support software diversity, the proposed solution is a scientific workflow environment that allows researchers to create their own personalized workflows, using components provided by the development team as well as components developed by themselves, regardless of the programming language being used. The basis for this environment is a component-based software product line. The proposed extractive method for product line development is supported by a software reengineering framework and uses existing modeling techniques. A case study was performed to evaluate aspects such as the improvement of component reuse and the workflow customization capability. The case study had positive results, showing that the proposed solution allows researchers to customize their workflows.
I. INTRODUCTION

Computational resources have been widely used not only in industry but also in research, benefiting the work of scientific communities and facilitating the sharing of data and computational services. This is the context of e-Science, in which computing becomes an integral and fundamental part of the success of scientific research in the most varied areas [1]. UNISIM [2], a research group working on Numerical Simulation and Management of Petroleum Reservoirs, operates in this context. The systems developed by UNISIM aim to provide researchers in the petroleum area (petroleum reservoir engineers, economists, geologists, etc.) with software infrastructures that support decision making and that are useful for automating tasks that rely on numerical reservoir simulation. The use of these infrastructures, integrated with numerical simulations, allows researchers to run scientific experiments that reproduce real environments and model the dynamic behavior of reservoirs.

An experiment is a well-known way for researchers to support the formulation of new theories. A scientific workflow is defined as the template of the experiment, in which the sequence of activities is described [3]. It is the main resource of the scientific experiment and serves as an abstraction to represent the chaining of activities for experimentation [4].

Among the software infrastructures developed by UNISIM is the UNIPAR system, developed in partnership with Petrobras. The UNIPAR system supports functionality in the reservoir simulation area, such as uncertainty and risk analyses for oil field development, economic analyses, and the distribution and parallelization of simulations on clusters. Using UNIPAR, researchers can run scientific experiments that model several probable scenarios of a reservoir by combining variations of its attributes, such as rock porosity and permeability.

One of the main challenges faced by UNISIM researchers is the difficulty of modifying UNIPAR to include new theories and research methodologies in the experiments. Researchers develop their tools using programming languages such as C, Python, and MatLab. However, these tools are hard to integrate into UNIPAR because they are developed in languages other than those used by the system. UNISIM has a development and maintenance team associated with UNIPAR, and for each new methodology created by a researcher, a change request is sent to this team. This situation ends up delaying the research process, since the researcher has to wait until a new version of the system is released to continue their studies.

From the point of view of the development and maintenance team, one of the greatest difficulties with the UNIPAR system is the large maintenance effort it demands. This is due to the fact that UNIPAR has approximately 223 KLOC, and its unplanned growth resulted in a lot of duplicated and highly coupled code. Changes that in theory should be simple end up taking a long time to implement.
There are several commercial solutions to support research in the petroleum area, such as numerical reservoir simulators, among which we can mention IMEX, STARS, GEM, and ECLIPSE; systems for geological reservoir modeling, such as Petrel; and systems that combine simulators and optimization techniques, such as CMOST. Most of these solutions are isolated systems that do not allow communication and data transfer with other systems, requiring an infrastructure to integrate them. This data transfer is quite complex because each system has different types and formats of input and output data. Among the systems mentioned, Petrel supports the inclusion of new components; however, it demands the use of specific programming languages, which is often a non-trivial task for researchers in the petroleum area.

Software product lines (SPL) and component-based development (CBD) are software engineering techniques for which there are reports in the literature ([5], [6]) indicating that they can contribute to reducing software maintenance cost and time. This work proposes an infrastructure to support the execution of scientific experiments, based on a component-based SPL called UNIPAR-LPS (developed from the UNIPAR system). The infrastructure supports the organization and storage of the components and data used in scientific experiments in the petroleum area, and its use allows researchers to combine SPL components, integrating them (or not) with components developed by themselves or by third parties, or simply to reproduce new configurations with the same components, thus facilitating the researchers' work in creating experiments. Moreover, with UNIPAR-LPS, researchers can reuse data and scientific knowledge from experiments by retrieving data from previously generated experiments. The solution was implemented using a proposed SPL extraction method, which is based on the Feature-Oriented Reengineering (ROC) framework [7], combining modeling techniques from the PLUS, UML Components, and FArM methods found in the literature.

There are several works in the literature that address SPLs in the context of e-Science. The work of Ogasawara et al. [8] provides an approach to support the design and use of scientific workflows using SPL concepts. The work of Costa et al. [9] exploits the advantages offered by the joint use of feature models and ontologies to obtain an SPL for generating scientific workflows in a previously established domain. Solutions such as Taverna [10] provide workflow management environments whose goal is the design and execution of scientific experiments on the web, targeted at the Biological Sciences area. Most of the cited works focus on workflow design, not on the inclusion of new components. Unlike the works cited above, the focus of this work is to provide an infrastructure with which researchers can create their own experiments by (re)using ready-made components or components developed by themselves. These components can be developed in any programming language (for example Python, MatLab, C).
The main contribution of this work is to provide an infrastructure to support research in the petroleum area that enables researchers to include their own components in the SPL (without requiring programming knowledge of the system) and to (re)execute scientific experiments with or without their own components, besides improving the reuse of experiments in terms of data and knowledge. To this end, we define a method that shows how an SPL can be extracted from a legacy system, based on the ROC framework. The solution was evaluated using scientific experiments previously created by researchers, extracted from the Case Database (Banco de Casos), an internal UNISIM web tool. Several scientific experiments were reproduced in UNIPAR-LPS in order to verify whether: (i) the created infrastructure is able to functionally replace the legacy UNIPAR system; (ii) common functionality is reused across internal UNIPAR modules; and (iii) new components can be added to the infrastructure, both by the researcher and by the developer. The case studies presented positive results, indicating that the proposed solution facilitates the modification of UNIPAR-LPS and allows researchers to customize their workflows, thus supporting the scientific research process.

The remainder of this paper is organized as follows: Section II presents the theoretical background on software product lines, component-based development, and architectural styles needed to understand this work, as well as a description of the UNIPAR system. Section III presents related work. The proposed approach and the created infrastructure are detailed in Section IV. Section V presents three case studies and the analysis of the obtained results. Finally, Section VI discusses the conclusions of this paper, highlighting its main contributions and indicating possible future work.

II. THEORETICAL BACKGROUND

A. Variability and Feature Model
One of the main concepts of an SPL is variability, which is the ability to change or customize a system. Feature models are used to define and manage the commonalities and variabilities of an SPL. A feature is an important, user-visible aspect, quality, or functionality of a software system [11]. Features can be mandatory (required for all members of the SPL), optional (required for some members of the SPL), or alternative (indicating the existence of alternatives, of which at least one must be selected). The feature model represents the features of a product family in a domain, the relationships among them, and their commonalities and differences (variabilities).

B. Feature-Oriented Reengineering
Extractive SPL adoption approaches apply a reengineering process to existing software systems to create the SPL. The systems go through reverse engineering, so that they can be understood and their requirements extracted, and then through forward engineering (during the design of the SPL). Feature-Oriented Reengineering (ROC) is a framework proposed by Kang et al. [7] that uses the reengineering approach for extractive SPL adoption. It is composed of five phases, the first two of which correspond to the reverse engineering of the legacy systems.
In Phase 1, Architecture Recovery, components and architectural information are extracted from the legacy system. In Phase 2, Feature Modeling, a feature model of the legacy application is generated based on the recovered information and on domain knowledge. The next three phases correspond to the engineering of the SPL. Considering possible business opportunities and technology changes, in Phase 3, Feature Model Refinement, the feature models of the legacy systems are updated and unified, giving rise to the refined feature model. In Phase 4, SPL Design, the artifacts are developed and the new architecture and components for the SPL are designed, based on the refined feature model and on engineering principles. In Phase 5, SPL Implementation, the SPL is developed.

C. The PLUS Method
The PLUS (Product Line UML-Based Software Engineering) method [12] is based on UML (Unified Modeling Language) and aims to make the variabilities and commonalities of an SPL explicit through use cases and feature models. The method is composed of four steps, briefly described below.
Requirements Modeling: Just like features, use cases and their actors are classified as mandatory, optional, and alternative, identified by the stereotypes <<kernel>>, <<optional>>, and <<alternative>>, respectively. This step consists of building the use case model and the feature model and, once both models are ready, establishing the relationships between them.
SPL Analysis: Composed of four models: (i) the static model, which defines the static entities of the SPL and their relationships; (ii) the dynamic model, created from UML diagrams to represent the interactivity and messages of the system; (iii) the state machine dynamic model, which defines the states of each entity and their life cycles and provides a means of interpreting them; and (iv) the feature dependency model, which determines the dependencies of each class on the features of the systems.
SPL Design: Composed of two phases: (i) software architecture, in which the architectural structure and the design patterns for the development of the SPL are defined; and (ii) component-based software design, in which the SPL is divided into components and the interfaces between them are defined.
Software Application Engineering: The members of the SPL are developed using the models produced in the previous steps.
Although PLUS is quite detailed in its first two steps, it is superficial with regard to SPL design, for which the use of another method in combination is expected. In this project we use the FArM method.

D. The FArM Method
The FArM (Feature-Architecture Mapping) method [13] proposes four transformations applied to the feature model in order to create an initial (component-based) SPL architecture.
Removal of NRA features and resolution of quality features: Non-architecture-related (NRA) features are those that are not considered architecturally significant requirements; they can be physical or business features. These features are removed from the model, and then the features that represent quality requirements are integrated into those that represent functional requirements.
If there are no functional features into which the quality features can be integrated, a new functional feature may be added for this integration. The goal of this integration is to keep the quality features represented in the architecture.
Transformation based on architectural requirements: Architectural requirements that have not been represented in the model but that influence the architecture must be represented explicitly as features and integrated with existing features, or with features created to represent them, just as happens with the quality features. The goal is to counterbalance the influence that end users have on the specification of the feature model.
Transformation based on feature interaction relations: The communications between features are specified and interaction relations are introduced among them. The relations can be of three types: (i) type 1, uses (relates two features where one uses the functionality of the other); (ii) type 2, modifies (relates two features where the correct operation of one alters the behavior of the other); and (iii) type 3, contradicts (relates two features where the correct operation of one contradicts the behavior of the other). After this step, a new transformation is performed based on two criteria: (i) criterion 1, the type of relation between the features; and (ii) criterion 2, the number of interaction relations between features. Based on these criteria, features may be integrated or new features may be added.
Transformation based on feature hierarchy relations: In this transformation, features are classified into two types, (i) superfeature and (ii) subfeature, according to three hierarchy relations: (i) specialization, where a subfeature specializes a superfeature; (ii) aggregation, where a subfeature is part of a superfeature; and (iii) alternative, where the subfeatures represent alternatives for the superfeature. The method proposes that all hierarchy relations of the transformed feature model be of one of the three defined types. Invalid relations must be removed, and new relations may be created. Once this is done, the superfeatures are mapped to components that provide access to the functionality of the components that implement the subfeatures. The variability of the features and their relationships are preserved in the components. In order to create an initial SPL architecture, reference architectural styles are used to specify the relations among the identified components. This is not the final SPL architecture, because the method does not cover the identification of the interfaces or the specification of their method signatures.

E. The UML Components Process and the COSMOS* Model
UML Components [6] is a component-based software development process aimed at component reuse and at ease of maintenance and modifiability of the system, thereby obtaining gains in productivity and quality. The UML Components process proposes diagrams used to specify component interfaces and their relations, among them the communication diagram, used to document how objects are linked and which messages they exchange during the process.
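To make the idea of explicit interfaces and connector-mediated composition more concrete, the sketch below illustrates it in Python, in the spirit of UML Components and of the COSMOS* model discussed next. The component and interface names are hypothetical, and the actual UNIPAR-LPS components are not restricted to any single language.

```python
# Hypothetical sketch (not from the paper) of component composition with explicit
# interfaces and connector-style wiring. Names are illustrative only.
from abc import ABC, abstractmethod

class ISimulationRunner(ABC):
    """Provided interface: what a simulation component offers."""
    @abstractmethod
    def run(self, model_file: str) -> str: ...

class UncertaintyAnalysis:
    """Component with a *required* interface: it only knows the interface,
    never a concrete component, which keeps coupling low."""
    def __init__(self, runner: ISimulationRunner):
        self._runner = runner          # injected at composition time (connector role)

    def analyze(self, scenarios: list) -> list:
        return [self._runner.run(m) for m in scenarios]

class ImexRunner(ISimulationRunner):
    """Concrete component realizing the provided interface (hypothetical)."""
    def run(self, model_file: str) -> str:
        return "results-for-" + model_file

# Wiring the required interface of one component to the provided interface
# of another; swapping ImexRunner for another runner requires no change to
# UncertaintyAnalysis.
composition = UncertaintyAnalysis(ImexRunner())
print(composition.analyze(["scenario01.dat", "scenario02.dat"]))
```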
COSMOS* (Component Structuring Model for Object-oriented Systems) [14] is a component structuring model for mapping component architectures onto programming languages. It defines elements for the implementation of components and connectors, the participants of a software composition. The COSMOS* model is mainly concerned with ensuring reusability (promoted by the low coupling of the system, since components interact only through connectors and a component has no knowledge of the others), adaptability (the possibility of modifying or replacing a component that participates in a software composition, isolating as many modifications as possible in the implementation of the connectors), and modifiability (only the interface of the component is known to its users, thus making it possible to modify the implementation of a component without affecting its use). COSMOS* is independent of technological platform, making it possible to adapt the model for use with C++, the development language used in this project.

F. The UNIPAR System
As already mentioned in Section I, the UNIPAR system is developed by UNISIM [2] and seeks to provide researchers with systems that support scientific research in the petroleum area. The system has five modules. Each module can be understood as a part of the system responsible for a specific research line. The MAI and MEC modules, used in this work, are briefly explained below.
Uncertainty and Risk Analysis Module (MAI): Addresses the methodology for quantifying the impact of geological uncertainties in the evaluation of petroleum reservoirs. This methodology is based on the simulation of models that represent the possible scenarios of the reservoir, using a combination of the uncertain attributes that characterize it. The MAI simulation models are built using the statistical techniques of derivation tree (AD) and/or discrete Latin hypercube (HCLD).
Economic Analysis Module (MEC): Performs calculations of economic indicators for wells, groups of wells, and oil fields, making it possible to study the feasibility of an oil exploration project. These calculations use information resulting from a simulation of the production behavior in the field, applying economic parameters (oil barrel price, fluid production/injection costs, etc.) defined by the user.
There are also tools complementary to these modules, for example converters of output files (XML format) to text format or Excel spreadsheets, and input and output data validators, which were not considered in the project either. As mentioned in the Introduction, the UNIPAR system is complex, its code is highly coupled, and it was implemented by different people using different languages, such as Java, C#, and C/C++. Of the 223 KLOC of UNIPAR, 136 KLOC were considered in this work: the MAI module has 82 KLOC and MEC has 54 KLOC.

III. RELATED WORK

Costa et al. [9] propose PL-Science, an approach that combines ontologies and feature models for the development of scientific SPLs, that is, SPLs that use scientific workflows. The PL-Science approach benefits from the semantics specified by the ontologies and from the variability specified in the feature models to facilitate the selection and organization of scientific workflows.
The approach can be used in a way complementary to ours: while PL-Science focuses on the specification of domain models, our proposal focuses on the specification of the architectural design of SPLs. Ogasawara et al. [8] introduce the concept of experiment lines, a systematic approach for composing scientific workflows at different levels of abstraction. These abstraction levels are somewhat similar to the concept of variability in SPLs, that is, activities can be modeled at a high level of abstraction (i.e., variation points) and each activity can be materialized by different concrete activities (i.e., variants). Moreover, this approach is supported by a tool called GExpLine [15]. Both the work of Ogasawara et al. and this proposal aim to increase software reuse in the context of scientific workflows. However, while Ogasawara et al. focus on variability management through the concept of experiment line (and its abstractions), our approach aims to build the workflow using SPL and component techniques.

Acher et al. [16] describe a rigorous approach to model and compose parameterized services in a scientific workflow. Their approach uses SPL practices to model variability and to support the composition of services in a consistent way. A similarity between the approach of Acher et al. and ours is that both use the feature model to identify the variability of the workflow. However, Acher et al. focus on the modeling, composition, and consistency checking of the services that compose a workflow, whereas this approach focuses on the modeling and component-based architectural design of the components that compose a workflow.

Taverna [10] is a scientific workflow management environment used to compose and execute scientific experiments in the Bioinformatics area, providing access to several web services of that area. Although it is a fairly complete environment, it requires the use of Taverna's own manipulation and execution language, making scientists dependent on its execution environment. In the petroleum area, we can mention Petrel [17], a commercial software product widely used in scientific research, developed by Schlumberger. With Petrel, scientists can develop collaborative workflows and integrate knowledge from different disciplines. It is possible to develop custom plug-ins for Petrel through the Ocean framework, which provides .NET tools and offers stable and friendly interfaces for software development. Petrel only accepts plug-ins developed with this framework. The drawback is that researchers cannot integrate their own plug-ins into Petrel unless they have developed them using Ocean.

IV. PROPOSED SOLUTION

The solution proposed in this work is an infrastructure to support the execution of scientific experiments in the petroleum area, based on a component-based SPL called UNIPAR-LPS, developed from the legacy UNIPAR system. The infrastructure was implemented using an SPL extraction method based on the Feature-Oriented Reengineering (ROC) framework [7]. The ROC framework was chosen for the SPL extraction because it proposes the extraction of an SPL from a legacy system, which is the case of UNIPAR.
Furthermore, ROC covers both requirements analysis (done in the reverse engineering step of the legacy system) and the feature-based analysis, design, and implementation of the SPL. Together with ROC, we use modeling techniques from the PLUS method, the FArM method for the creation of the initial component-based SPL architecture, and techniques from the UML Components process for the specification of the interfaces between components. Although there is nothing in particular that prevents its use in another domain, the proposed infrastructure was tested only in the scientific domain of the petroleum area; therefore, we cannot claim that it is generic. For space reasons, not all artifacts generated during the construction of the proposed solution are shown here. The artifacts can be found in the master's dissertation related to this project [18].

A. Overview of the Proposed Solution
The provided infrastructure will be used both by UNISIM researchers, for the composition of the scientific workflows that define the scientific experiments, and by the UNISIM development team, which will continue the development and maintenance of UNIPAR-LPS. A feature in UNIPAR-LPS is always mapped to a component or to a component parameter; when it is mapped to a component, the new feature can be added by the researchers themselves, without requiring work from the development team. The component must be added to the SPL core asset base, and it is independent of programming language. If such components are not part of any of the SPL products (i.e., experiments), their use in this context may introduce inconsistencies; therefore, components developed by third parties must be included in the SPL in a way that is consistent with the other components, and composition rules exist in the feature model. The proposed infrastructure allows the creation of customized products for the researchers, generated using UNIPAR-LPS components or components developed by themselves or by third parties, as long as the components respect the input and output interfaces of the components to which they connect. Figure 1 shows how developers and researchers interact with the infrastructure: developers maintain the SPL, updating components and adding new components to the core asset base; researchers can use existing components, implement new components, and use them in their workflows.

Figure 1. Overview of the infrastructure.

B. Executing Scientific Experiments Using UNIPAR-LPS
As shown in Figure 2, the execution of a scientific experiment using the UNIPAR system requires three activities: (1) Experiment Planning, (2) UNIPAR execution, and (3) Results Analysis. Using UNIPAR-LPS, the execution is similar, but activity (2) is divided into three steps: (2.1) the user resolves the UNIPAR-LPS variabilities in order to select the features that will be part of the product to be generated; (2.2) the user generates the workflow specification file, indicating which components and parameters should be executed and in which order; and (2.3) the workflow is executed through the command line. These are the steps that enable the generation of customized products.
Atividades para a execução do UNIPAR e da UNIPAR-LPS É necessário que o usuário especifique algumas informações sobre o experimento, tais como, o simulador que será utilizado, nome do modelo base de simulação, atributos para o projeto, etc. Esses dados iniciais devem ser informados em um arquivo de entrada comum a todos os componentes da LPS, em formato XML. Logo no inı́cio da execução, o componente central da UNIPAR-LPS, responsável pela execução do workflow, gera uma cópia deste arquivo com o nome default.xml para que todos os componentes da UNIPAR-LPS consigam encontrar as informações necessárias. Também através de um arquivo XML, o pesquisador compõe o workflow utilizando os componentes existentes e/ou adicionando seus próprios componentes, desde que os novos componentes respeitem as interfaces de entrada e saı́da dos componentes que com eles se conectam. A UNIPAR-LPS possui um componente central que é o responsável por validar o workflow gerado e continuar ou não a execução. Também é neste arquivo que são informados os parâmetros e seus respectivos valores para cada componente que será executado. A infraestrutura proposta é independente de linguagem de programação, uma vez que o fluxo de dados entre componentes ocorre apenas por meio de informações persistidas em arquivos. Para garantir o compartilhamento de dados, propõe-se a criação de um novo componente, responsável por armazenar na LPS as informações dos experimentos executados, e também responsável por copiar um experimento para uma área em disco compartilhada. Muitas vezes, dados de experimentos são confidenciais, principalmente quando o experimento envolve um reservatório real e não sintético. Por isso, é necessário tomar cuidado com a privacidade dos dados. Propomos que o pesquisador indique ao componente de compartilhamento de dados, através de um parâmetro, se ele deseja que seus dados sejam públicos ou privados. Caso sejam públicos, podese copiar o experimento para o local compartilhado com permissão do diretório para todo o grupo de pesquisa. Caso seja privado, deve-se copiar com permissão apenas para o pesquisador responsável pelo experimento. O método proposto para a adoção de LPS foi aplicado ao sistema UNIPAR para a construção da UNIPAR-LPS. Cada uma das fases do método é explicada, mostrando como foi aplicada ao sistema UNIPAR. D. Fase 1: Engenharia Reversa da UNIPAR-LPS A Engenharia Reversa pode ser definida como um processo de análise de um sistema, a fim de identificar os componentes desse sistema e seus relacionamentos, e criar representações do sistema em uma outra forma ou em um nı́vel mais alto de abstração. Ela objetiva a extração de documentação de projeto a partir do código, representando uma atividade ou um conjunto de técnicas que visam facilitar a compreensão de um sistema existente. Conforme proposto pelo arcabouço da ROC, esta fase é composta por duas atividades: (1.1) Recuperação da Arquitetura e (1.2) Modelagem de caracterı́sticas dos sistemas legados. A fase tem como entrada o código legado do sistema UNIPAR e como saı́das a arquitetura recuperada e o modelo de caracterı́sticas do sistema legado, como visto na Figura 4. C. Construção da UNIPAR-LPS Como já visto na Seção II, uma das maneiras de adotar a engenharia de LPS é utilizando uma abordagem extrativa. 
Ela é apropriada nos casos em que se têm artefatos de sistemas existentes que podem ser reutilizados, e ainda mais apropriada quando os sistemas têm uma quantidade significativa de partes em comum e também diferenças significativas entre eles [19], como é o caso do sistema UNIPAR. Além disso, não é necessário realizar extração de todos os sistemas preexistentes de uma só vez, pode-se fazer incrementalmente. Por estes motivos, a abordagem extrativa foi adotada para a realização deste projeto. Propomos um método extrativo para o desenvolvimento de uma LPS baseada em componentes, implementada a partir de um sistema legado. As metodologias utilizadas foram escolhidas pelo fato de todos os modelos propostos por elas, com exceção do modelo de caracterı́sticas, serem baseados em UML, e são conhecidos boa parte dos desenvolvedores e tem bastante apoio de ferramentas. Esse método foi aplicado ao sistema UNIPAR para a geração da UNIPAR-LPS. Para facilitar a compreensão deste método, pode-se representá-lo através de quatro fases principais (Figura 3): (1) Engenharia Reversa da LPS, (2) Análise da LPS, (3) Projeto da LPS e (4) Implantação da LPS. Figura 4. Fase de Engenharia Reversa do método proposto Fase 1.1: Recuperação da Arquitetura do UNIPAR. A recuperação da arquitetura visa melhor compreensão do sistema legado, e assim auxiliar na migração para uma abordagem de engenharia de linhas de produtos. As arquiteturas legadas são analisadas e se possı́vel utilizadas na implementação da LPS, mais precisamente para o desenvolvimento do núcleo de artefatos. Uma vez que o método proposto é baseado em caracterı́sticas, o modelo de caracterı́sticas é essencial para o desenvolvimento da LPS. A recuperação da arquitetura neste projeto é responsável por identificar as relações entre os módulos do sistema UNIPAR e entender seu funcionamento para, a partir dos artefatos obtidos, construir o modelo de caracterı́sticas do sistema legado. Sugere-se aqui a recuperação de uma visão estática e uma visão dinâmica. Pelo fato do UNIPAR ser um sistema grande e possuir código estruturado, optou-se por recuperar uma visão estática de alto nı́vel, apenas para documentar o relacionamento de seus módulos. Foi gerado um diagrama com uma visão macro da comunicação entre os módulos e as bibliotecas. Para compreender o funcionamento de cada módulo, foram gerados workflows de alto nı́vel e detalhados para cada um deles. Figura 3. A partir dos workflows é possı́vel identificar as tarefas que o módulo deve realizar e também como a escolha de parâmetros de execução influencia seu funcionamento. A recuperação da Método de extração de LPS proposto (adaptado de [20]) 100 Figura 5. Figura 6. Fase 2: Análise da LPS arquitetura foi realizada sem o uso de técnicas ou ferramentas automatizadas, mas apenas com o conhecimento do domı́nio. Fase 1.2: Modelagem de Caracterı́sticas do UNIPAR. Com base nos workflows gerados, construiu-se um modelo de caracterı́sticas para cada um dos módulos do sistema UNIPAR. A notação para a modelagem de caracterı́sticas utilizada no projeto é a notação proposta por Kang et al. [11]. Fase 3: Projeto da LPS Fase 2.3: Relacionamento entre Casos de Uso e Modelo de caracterı́sticas. Tanto os casos de uso quanto o modelo de caracterı́sticas representam requisitos do sistema. O método PLUS sugere que sejam criadas tabelas para representar os relacionamentos entre os casos de uso e as caracterı́sticas, que não são necessariamente de um-para-um. 
Eventualmente um caso de uso pode se associar a mais de um elemento no modelo de caracterı́sticas e vice-versa. Nesses modelos foram representadas todas as caracterı́sticas de cada um dos módulos, indicando quais são obrigatórias, opcionais e alternativas, bem como explicitadas as regras de composição. Todas as atividades do fluxograma foram mapeadas em uma ou mais caracterı́sticas. A rastreabilidade da variabilidade entre os dois modelos facilita a manutenção e evolução da variabilidade [21], uma vez que ao alterar um caso de uso sabe-se exatamente qual ou quais caracterı́sticas são afetadas com esta alteração. Ou seja, o relacionamento feito entre os casos de uso e caracterı́sticas são fundamentais para a evolução da LPS. E. Fase 2: Análise da UNIPAR-LPS Para realizar este relacionamento foram geradas três tabelas. A primeira foi alimentada com todos os casos de uso gerados. A segunda foi gerada contendo todas as caracterı́sticas do modelo de caracterı́sticas da LPS. Por fim, a última tabela é a responsável por fazer o relacionamento entre os casos de uso da primeira tabela com as caracterı́sticas presentes na segunda. Por falta de espaço, essas tabelas não foram apresentadas no artigo mas podem ser encontradas na dissertação [18]. Durante esta fase, é desenvolvido um modelo de requisitos onde os requisitos funcionais do sistema são definidos em termos de atores e casos de uso. Ela é composta por três atividades: (2.1) Modelagem de Caracterı́sticas, (2.2) Modelagem de Casos de Uso e (2.3) Relacionamento entre Caracterı́sticas e Casos de Uso. O artefato de entrada desta fase é o modelo de caracterı́sticas do sistema legado, gerado na Fase 1. Como saı́das da fase, é gerado um modelo de caracterı́sticas refinado, modelos de casos de uso, e é criada uma tabela relacionando os casos de uso e as caracterı́sticas. Fase 2.1: Modelagem de caracterı́sticas da UNIPARLPS. Nesta fase, o objetivo é gerar o modelo de caracterı́sticas da LPS. Para tanto, os modelos de caracterı́sticas dos módulos do sistema UNIPAR foram analisados, unificados, refinados, e alguns novos requisitos foram adicionados, gerando um único modelo de caracterı́sticas que representa a UNIPAR-LPS. Fase 2.2: Modelagem de Casos de Uso. Para especificar os requisitos funcionais de uma LPS, é importante capturar os requisitos comuns a todos os membros da famı́lia (obrigatórios), os opcionais e alternativos. Assim sendo, foram gerados casos de uso para todos os cenários de execução dos módulos do sistema UNIPAR, identificando-os com os esteriótipos <<kernel>>, <<optional>>, e <<alternative>>. 101 F. Fase 3: Projeto da UNIPAR-LPS Esta fase é composta por três atividades: (3.1) Aplicação do Método FArM, utilizado para a geração de uma arquitetura inicial para a LPS, (3.2) Especificação de Diagramas de Comunicação e (3.3) Especificação de Interfaces. A entrada desta fase é o modelo de caracterı́sticas refinado e os modelos de casos de uso, gerados na Fase 2, e como saı́da temos a arquitetura final da LPS, diagramas de comunicação e documentos de especificação das interfaces dos componentes. As seções a seguir detalham as atividades desta fase. Fase 3.1: Método FArM. O método FArM (Seção II-D) é composto de quatro transformações feitas em cima do modelo de caracterı́sticas refinado, sendo que a última transformação resulta na arquitetura sugerida, baseada em componentes. O modelo de caracterı́sticas resultante após as transformações do método FArM tem 33 caracterı́sticas. 
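As a rough illustration (ours, not an artifact prescribed by the method, and with hypothetical use case and feature names), the use-case-to-feature relationship captured by the third table can be represented as a many-to-many mapping, which makes the impact query described above direct:

import java.util.*;

// Minimal sketch of the use-case/feature traceability captured by the third table.
// Use case and feature identifiers are hypothetical.
public class TraceabilityTable {
    private final Map<String, Set<String>> useCaseToFeatures = new HashMap<>();

    public void relate(String useCase, String feature) {
        useCaseToFeatures.computeIfAbsent(useCase, k -> new HashSet<>()).add(feature);
    }

    // Features affected when the given use case changes.
    public Set<String> affectedFeatures(String useCase) {
        return useCaseToFeatures.getOrDefault(useCase, Collections.emptySet());
    }

    public static void main(String[] args) {
        TraceabilityTable table = new TraceabilityTable();
        table.relate("UC_RunSimulation", "F_Simulator");
        table.relate("UC_RunSimulation", "F_SensitivityAnalysis"); // one use case, several features
        System.out.println(table.affectedFeatures("UC_RunSimulation"));
    }
}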
F. Phase 3: Design of the UNIPAR-LPS

Figure 6. Phase 3: SPL Design

This phase comprises three activities: (3.1) application of the FArM method, used to produce an initial architecture for the SPL, (3.2) specification of communication diagrams and (3.3) specification of interfaces. Its inputs are the refined feature model and the use case models produced in Phase 2; its outputs are the final SPL architecture, communication diagrams and the specification documents of the component interfaces. The activities of this phase are detailed below.

Phase 3.1: FArM Method. The FArM method (Section II-D) consists of four transformations applied to the refined feature model, the last of which results in the suggested component-based architecture. The feature model resulting from the FArM transformations has 33 features. After the transformations, the features are mapped to components: super-features are mapped to components that provide access to the functionality of the components that implement their sub-features, and the relationships among features are preserved in the components. Based on architectural styles and reference architectures, the relationships among the components resulting from the fourth transformation were specified.

Phase 3.2: Specification of Communication Diagrams. Based on the components identified in the architecture obtained after applying the FArM method (Phase 3.1), and on the use cases produced in Phase 2, communication diagrams were created to represent all possible use case flows (i.e., mandatory, optional and alternative). These diagrams are used in the next activity (Phase 3.3) to specify the provided and required interfaces of each component.

Phase 3.3: Specification of Component Interfaces. The last activity of Phase 3 is the specification of the interfaces between components. In this project we chose a pipe-and-filter architecture for the SPL, in which the filters are the components and the pipes are files; that is, the components communicate through files. UNIPAR uses XML files, and this format was kept to ease the migration to the SPL. XSD files were produced for all file formats accepted by the SPL components, and documents were written to specify them.

G. Phase 4: Implementation of the UNIPAR-LPS

In this phase the components are implemented. In this project, products are not generated in this phase, since the researcher resolves the variability and chooses the features of his or her product when the workflow is conceived; these choices are recorded in a specification file that represents a scientific workflow. Based on the models produced in the previous activities, the SPL components are developed incrementally, populating the core assets. In this project, the components were developed from the code of the UNIPAR system, and the COSMOS* model (Section II-E) was used for standardization. Most of the UNIPAR code is written in C/C++, so C++ was chosen as the implementation language for the components.

The variability of the system was implemented at two levels: (i) at the architectural level, through the insertion and/or removal of variant components (i.e., optional or alternative ones), and (ii) at the implementation level, through component parameterization. That is, there are mandatory, optional and alternative components, as well as parameters that specify the required behavior of each component. As stated in Section IV, in the UNIPAR-LPS the variability was implemented both at the architectural level and through execution parameters. A document (Figure 7) was generated from the feature model obtained after the last FArM transformation, with information on how each feature was implemented in the UNIPAR-LPS. With this document the researcher can inspect and choose the features of his or her product, that is, resolve the variability of the SPL, and then produce the workflow specification file. In some components, parameters indicate which subcomponent must be executed. An example is the Metodologia component, shown in Figure 1, which has four subcomponents that specify the type of methodology to be used: AS, AD, HCLD and EC. To run the AS methodology, for example, the Metodologia component must be selected and the parameter '-s' must be passed to it.
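As a rough Java sketch of this parameter-driven variability (hypothetical code: the actual components are written in C++, and only the '-s' flag for the AS methodology comes from the paper), a component entry point could dispatch to a variant as follows:

// Hypothetical sketch, in the spirit of the Metodologia component: a command-line
// flag selects which variant subcomponent runs. Flags other than '-s' are assumed.
public final class MethodologyLauncher {

    public static void main(String[] args) {
        String flag = args.length > 0 ? args[0] : "";
        if ("-s".equals(flag)) {
            runSensitivityAnalysis();   // AS variant ('-s' as described in the paper)
        } else if ("-e".equals(flag)) {
            runEconomicAnalysis();      // EC variant (flag name assumed)
        } else {
            throw new IllegalArgumentException("Unknown methodology flag: " + flag);
        }
    }

    private static void runSensitivityAnalysis() {
        // would read the shared default.xml and run the sensitivity analysis methodology
    }

    private static void runEconomicAnalysis() {
        // would read the shared default.xml and run the economic analysis methodology
    }
}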
A component called Workflow was created to support the execution of the scientific experiments. It is this component that implements the main function of the call-return structure of the proposed solution: it is responsible for validating the scientific workflow to be executed and for calling the components that make it up. The flow of work defined by the researcher is the premise of the scientific workflow.

V. CASE STUDIES

Three case studies were conducted to evaluate the solution qualitatively with respect to three main goals: (i) verify whether the proposed solution is feasible; (ii) assess the reusability of components and scientific workflows; and (iii) assess the modifiability of the UNIPAR-LPS. A qualitative analysis proved adequate because the aim was to understand how the approach affected the context in which it was applied, not to generalize. The experiments used in the case studies were taken from the UNISIM Case Database, an internal web tool that stores all experiments carried out by the group's researchers.

A. Case Study: Rebuilding the MAI Product

In this case study, the MAI product was rebuilt from the artifacts made available by the UNIPAR-LPS and its execution was compared with the execution of the MAI module of the UNIPAR system. To that end, we ran an existing experiment created with MAI that works with the Namorado oil field, a real field located in the Campos Basin, in Rio de Janeiro. The purpose of the study is to verify the ability of the UNIPAR-LPS to reproduce the MAI module of the legacy UNIPAR system. To do so, we executed MAI using the Sensitivity Analysis (AS) methodology with the existing experiment; we then specified a workflow that generated the MAI product, derived from the UNIPAR-LPS, and executed the same experiment with the generated product. At the end of the executions, we analyzed the results obtained and the files produced. To ease the migration, the data input files required by the UNIPAR modules were kept in the UNIPAR-LPS. To choose the components and parameters needed to reproduce exactly the execution of the UNIPAR system, we used a document produced in Phase 4 (Section IV-G) of the proposed method, which we named Configura Produto. Figure 7 shows part of the model with the variability resolved (marked in gray) for this experiment; the dashed lines indicate how each feature was implemented.

Figure 7. Resolved variability for the reconstruction of the MAI product

To use a super-feature, the indicated component must be used; sub-features are chosen through parameters passed to the component of their super-feature. To simplify the figure, only the components and parameters used in this step of the case study are shown. Besides the file that specifies the basic information of the experiment execution (base file, simulator, time steps, etc.), the file that specifies the scientific workflow must also be produced. This file lists all the components and execution parameters that make up the product to be generated. With the components and parameters chosen, we produced the file shown in Figure 8.

Figure 8. Workflow representing the execution of MAI

Once all files were correctly generated in the working directory, the UNIPAR-LPS was executed.

B. Case Study: Rebuilding the MEC Product

In this case study, the execution of the Economic Analysis methodology of the SPL was compared with the execution of the MEC module, but the focus was on the reuse of the components employed in the previous case study. An existing MEC experiment was executed. The goal of this case study was to evaluate the UNIPAR-LPS with respect to the reuse of software components and also to evaluate the reconstruction of the MEC product. Although the MAI and MEC modules of the UNIPAR system share common parts, they contain a lot of duplicated code. In this study we used the scientific workflow specified in the previous case study (Section V-A), changing only the parameter of the methodology component so that it executed the Economic methodology. To validate the generated product, we compared its execution with the execution of the MEC module; at the end of the executions, the results and the generated files were analyzed. The data input files of the MEC module were kept in the UNIPAR-LPS, so the preparation of the input files described above applies both to MEC and to the UNIPAR-LPS. The workflow produced for the previous case study was reused in this one; only the parameter sent to the methodology component was changed, so that it now calls the economic calculation component. The Configura Produto document was used again to choose the components and parameters needed to reproduce exactly the execution of the MEC module of the UNIPAR system. Once all files were correctly generated in the working directory, the UNIPAR-LPS was executed.

C. Case Study: Adding a Component to the UNIPAR-LPS

In the particular case of the solution proposed in this work, a new feature, for example a new methodology for studying petroleum fields, can be added to the SPL by the end user; it may correspond to a software component developed by the researcher or to a software system developed by third parties. The goal of this case study is to evaluate the addition of a new variant to the SPL and its use in the execution of a scientific experiment, and thereby to assess how easily the UNIPAR-LPS can be modified. We used the same workflow specified in the MAI reconstruction case study (Section V-A), changing only the methodology to the one implemented by the new component. For a new methodology to be included in the SPL, a new variant had to be created in the feature model, at an existing variation point (in this case Metodologia), and the feature is implemented by adding a new artifact to the core assets of the UNIPAR-LPS. We created a "black-box" component developed in MatLab that, given a numerical reservoir model as input, generates 10 derived models. The intent of the developed component is to replace the Metodologia component available in the UNIPAR-LPS with a new component.
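To give a concrete flavour of how such a language-independent, file-based component can be invoked, the sketch below (ours, in Java; the executable name is the one given in the paper, but the argument and invocation details are assumptions) launches an external black-box binary the way the workflow infrastructure could:

import java.io.File;
import java.io.IOException;

// Hypothetical sketch: invoking a black-box component (e.g., a compiled MatLab program)
// that reads the shared experiment description and writes derived model files,
// communicating with the rest of the workflow only through files in the working directory.
public class ExternalComponentRunner {

    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "./metodologia-personalizada",  // executable placed in the LPS executables directory
                "default.xml");                 // shared experiment file read by every component (argument assumed)
        pb.directory(new File("."));            // working directory where input/output files live
        pb.inheritIO();                         // forward the component's console output
        Process process = pb.start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IllegalStateException("Component failed with exit code " + exitCode);
        }
        // Downstream components would now pick up the derived model files from the working directory.
    }
}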
This new component could correspond to the implementation of any new methodology developed by a UNISIM researcher, in any programming language. For the component to be usable in the experiment, its output interface must follow the input interface specification defined for the next component to be executed by the workflow. The component we created was named metodologia-personalizada, and its executable must be added to the executables directory of the UNIPAR-LPS. Once all files were correctly generated in the working directory, the UNIPAR-LPS was executed.

D. Results

Because the results produced by the execution of the UNIPAR modules were identical to those produced by the execution of the products generated by the UNIPAR-LPS (products generated from the workflow specifications), it is clear that the UNIPAR-LPS achieved its first goal: replacing the legacy UNIPAR system. We can conclude that the proposed method for developing an SPL from legacy code is feasible, since with it we were able to capture the features of the UNIPAR system and reproduce them in the UNIPAR-LPS. In total, 18 of the 32 features of the UNIPAR-LPS were exercised in the case studies and worked correctly. The first two case studies specified scientific workflows that generated products derived from the UNIPAR-LPS capable of reproducing the MAI and MEC modules of the UNIPAR system. The two products generated by the UNIPAR-LPS used the same set of components, showing that components can be reused across different products. The third case study showed that a new component can be added to the SPL, regardless of the programming language used to implement it, without the intervention of the UNISIM development team. The three case studies produced positive results, indicating that the UNIPAR-LPS is able to replace the UNIPAR system and that it is easy to adapt to new research methodologies through the addition of new components.

VI. CONCLUSIONS AND FUTURE WORK

This work presents an infrastructure for executing scientific workflows whose basis is a component-based SPL, called UNIPAR-LPS, developed from the UNIPAR system. An extractive method for adopting an SPL from legacy systems was proposed and used to implement the UNIPAR-LPS from UNIPAR. With the infrastructure, the variability of the UNIPAR-LPS is resolved by the researcher at the moment the scientific workflow is composed. Products are generated in a customized way, allowing researchers to compose their experiments using the ready-made UNIPAR-LPS components and, if desired, adding components developed by themselves or by third parties, without having to request changes from the development team. The results of the case studies provided evidence that code maintenance became easier and that the modifiability of the system, data sharing and component reuse were improved. Future work that complements this project includes the integration of the UNIPAR-LPS with a scientific workflow execution engine.

REFERENCES
[1] D. C. Pallazi, "Qdaontology – abordagem para o desenvolvimento de ontologias em e-science: um estudo de caso em biologia," Master's thesis, Universidade Federal de Juiz de Fora, 2010.
[2] "Portal UNISIM," URL: http://www.unisim.cepetro.unicamp.br/. Acesso em: 02 out. 2012.
[3] E. Deelman, D. Gannon, M. Shields, and I. Taylor, "Workflows and e-Science: An overview of workflow system features and capabilities," Future Generation Computer Systems, vol. 25, no. 5, pp. 528–540, 2009.
[4] E. Ogasawara, L. Murta, C. Werner, and M. Mattoso, "Linhas de experimento: Reutilização e gerência de configuração em workflows científicos," in 2nd Wksp. on E-Science, 2008, pp. 31–40.
[5] K. C. Kang, S. Kim, J. Lee, K. Kim, E. Shin, and M. Huh, "FORM: A feature-oriented reuse method with domain-specific reference architectures," Annals of Software Engineering, vol. 5, pp. 143–168, 1998.
[6] J. Cheesman and J. Daniels, Eds., UML Components. Addison-Wesley, 2001.
[7] K. Kang, M. Kim, J. Lee, and B. Kim, "Feature-oriented re-engineering of legacy systems into product line assets – a case study," Software Product Lines, pp. 45–56, 2005.
[8] E. Ogasawara, C. Paulino, L. Murta, C. Werner, and M. Mattoso, "Experiment line: Software reuse in scientific workflows," in Scientific and Statistical Database Management, ser. Lecture Notes in Computer Science, M. Winslett, Ed. Springer, 2009, vol. 5566, pp. 264–272.
[9] G. C. B. Costa, R. Braga, J. M. N. David, F. Campos, and W. Arbex, "PL-Science: A scientific software product line," Procedia Computer Science, vol. 18, pp. 759–768, 2013.
[10] "myGrid Project," URL: http://www.mygrid.org.uk/. Acesso em: 27 jul. 2012.
[11] K. C. Kang, S. Cohen, J. Hess, W. Novak, and S. Peterson, "Feature-oriented domain analysis," CMU/SEI, Tech. Rep., 1990.
[12] H. Gomaa, Designing Software Product Lines with UML: From Use Cases to Pattern-Based Software Architectures. Addison-Wesley, 2004.
[13] P. Sochos, M. Riebisch, and I. Philippow, "The feature-architecture mapping (FArM) method for feature-oriented development of software product lines," in Proc. of the Annual IEEE Intl. Symposium and Wksp. on Engineering of Computer Based Systems. IEEE Computer Society, 2006, pp. 308–318.
[14] L. A. Gayard, C. M. F. Rubira, and P. A. C. Guerra, "COSMOS*: a COmponent System MOdel for Software Architectures," Instituto de Computação, UNICAMP, Tech. Rep. IC-08-04, 2008.
[15] D. de Oliveira, E. Ogasawara, F. Seabra, V. Silva, L. Murta, and M. Mattoso, "GExpLine: a tool for supporting experiment composition," in Provenance and Annotation of Data and Processes. Springer, 2010, pp. 251–259.
[16] M. Acher, P. Collet, P. Lahire, and R. France, "Managing variability in workflow with feature model composition operators," in Software Composition. Springer, 2010, pp. 17–33.
[17] "Petrel E&P Software Platform," URL: http://www.slb.com/services/software/geo/petrel.aspx/.
[18] F. Y. S. Foschiani, "Uma abordagem de linhas de produtos de software para apoiar e-science," Master's thesis, Instituto de Computação, Unicamp, 2012.
[19] C. W. Krueger, "Easing the transition to software mass customization," in Intl. Wksp. on Software Product-Family Engineering. Springer, 2002, pp. 282–293.
[20] K. Kang, M. Kim, J. Lee, and B. Kim, "Feature-oriented re-engineering of legacy systems into product line assets – a case study," Software Product Lines, pp. 45–56, 2005.
[21] K. Berg, J. Bishop, and D. Muthig, "Tracing software product line variability: from problem to solution space," in Proc. of the Conf. of the SAICSIT. Republic of South Africa: SAICSIT, 2005, pp. 182–191.
BISTFaSC: An Approach To Embed Structural Testing Facilities Into Software Components

Marcelo Medeiros Eler
School of Arts, Sciences and Humanities
University of Sao Paulo
Sao Paulo – SP
[email protected]

Paulo Cesar Masiero
Institute of Mathematics and Computer Science
University of Sao Paulo
Sao Carlos – SP
[email protected]

Abstract—Component-based applications can be composed of in-house or COTS (commercial off-the-shelf) components. In many situations, reused components should be tested before their integration into an operational environment. Testing components is not an easy task because they are usually provided as black boxes and have low testability. Built-in Testing (BIT) is an approach devised to improve component testability by embedding testing facilities into software components, usually to support specification-based testing. Such components are called testable components. There are situations, however, in which combining specification-based and program-based testing is desirable. This paper proposes a BIT technique designed to introduce testing facilities into software components at the provider side to support structural testing at the user side, even when the source code is unavailable. An implementation to generate testable components written in Java is also presented. The approach was first evaluated through an exploratory study conducted to transform COTS components into testable components.

I. INTRODUCTION

Component-Based Software Engineering (CBSE) is a reuse-based approach that defines techniques to build systems by putting existing software components together. According to Szyperski, software components are units of composition with contractually specified interfaces and context dependencies [1]. A software component implements specific functionalities that may be shared by several applications, and the functionalities it provides can only be used through the operations exposed by its interfaces. Component-based applications can use in-house or COTS (commercial off-the-shelf) components.

CBSE brings many benefits to software development and maintenance [2]. Components are assumed to reach a high level of quality assurance in a short period of time and, due to market pressure, applications composed of such components are expected to inherit this high level of quality [3]. Experience has shown, however, that this assumption is not necessarily true in practice [4], [3], [5]. Beydeda and Weyuker state that the component provider might not be able to anticipate all possible application contexts and technical environments in which the component might be used [3], [4]. Thus, the quality assurance conducted by the component provider might not be effective. This scenario gives component users the responsibility of testing the reused components before gluing them together.

Testing COTS components, however, is not an easy task, because they present low testability. Testability has been defined as the degree to which a system or component facilitates the establishment of test criteria and the performance of tests to determine whether those criteria have been met [6]. Components present low testability because they are usually provided as black boxes and the source code is seldom available to conduct program-based testing [3]. Component users are thus forced to use only specification-based techniques. Moreover, component users also suffer from a lack of information, because relevant documentation and data to derive test cases and plans might not be available.
Several metadata and Built-In Testing (BIT) approaches have been proposed to mitigate the problems of lack of information and low testability. Metadata are intended to provide component users with relevant information to conduct and evaluate testing activities; this information may include test scripts, the inner structure of the component or invocation sequence constraints, for example [7]. BIT approaches improve component testability by adding testing facilities at the provider side to support testing activities at the user side. Such facilities are usually operations to control and observe a state machine, to evaluate invariants, to validate contracts or sequence constraints, and to generate or execute test cases automatically [8], [9], [10], [11], [12], [13].

Metadata and BIT approaches have indeed contributed to improving component testability, especially in support of specification-based testing. However, there are situations in which black-box testing of components is not deemed sufficient and combining implementation-based and specification-based testing techniques is desirable. In fact, these two techniques are meant to find different types of failures, and their combined application may provide higher confidence [14].

The main purpose of this paper is to present an overview of the approach called BISTFaSC (Built-In Structural Testing Facilities for Software Components), which was designed to improve component testability by embedding testing facilities into software components to support program-based testing techniques. Components with testing facilities are called testable components, as in traditional BIT techniques. They have probes inserted by instrumentation to record information about their execution (paths and data exercised). Tester components are associated with testable components to define the boundaries of a test session and to generate coverage analysis based on the information collected during a test session. BISTFaSC is a generic approach that can be applied to different technologies and platforms, since it only defines guidelines to transform software components into testable components. However, to validate our approach, we also present an implementation to generate testable components written in Java. This implementation is used to validate the feasibility of the approach by means of an exploratory study, which also shows how testable components can be used at the user side during testing activities.

This paper is organized as follows. Section II presents basic concepts of built-in and structural testing. Section III introduces the main concepts of BISTFaSC and Section IV shows its Java implementation. Section V presents an exploratory study conducted to validate and to understand the main concepts of the approach. Section VI discusses related work and Section VII provides some concluding remarks and future directions.

II. BACKGROUND

A. Structural Testing

Testing is the process of executing a program with the intent of finding faults [14]. Structural testing focuses on testing the structure of a program: test cases are generated to exercise its internal logic, considering instructions, paths and data. Test data are derived from the implementation according to criteria used to determine whether the program under test has been tested thoroughly enough [15], [14]. Three well-known structural criteria are all-nodes, all-edges and all-uses.

The all-nodes and all-edges criteria consider the execution control of the program and are known as control-flow criteria [16]. It is common to adopt a model called Control-Flow Graph (CFG) to represent the inner structure of the program and support the analysis of control-flow criteria. In this graph, each node represents a block of instructions without flow deviation and each edge represents a possible transition from one block to another. The all-nodes criterion requires that every node of the CFG be executed at least once, while the all-edges criterion requires that every edge be executed at least once. The all-uses criterion takes information about the program data flow into account. Rapps and Weyuker [17] proposed an extension of the CFG called Def-Use Graph (DUG) to add information related to variable usage. The classical all-uses criterion requires that every definition of a data object and its associated uses be executed at least once.

Structural testing criteria are used to derive test requirements that should be met by the execution of test cases. An example is presented in Figure 1, which shows a Java method and its associated CFG.

public int calcFactorial(int N)
{
    int x = N;              // node 1
    if (x < 1)              // node 1
        return 1;           // node 2
    else
        while (x > 1)       // node 3
            N = N * (--x);  // node 4
    return N;               // node 5
}

Fig. 1. Source code and the CFG of the operation calcFactorial

The structural test requirements of the method calcFactorial are presented in Table I. The all-nodes requirements define which blocks of instructions should be executed by the test cases. The all-edges requirements define which transitions from one node to another should be exercised at least once. In the all-uses requirements, (x,4,(3,5)) means that the variable x is defined in node 4 and used in a decision that takes the control flow from node 3 to node 5, and (x,1,4) means that variable x is defined in node 1 and used in a computation in node 4. Test cases should be created to exercise all possible definition-use pairs.

TABLE I. TEST REQUIREMENTS OF CALCFACTORIAL

Criterion   Test requirements
All-nodes   1, 2, 3, 4, 5
All-edges   (1,2), (1,3), (3,4), (3,5), (4,3)
All-uses    (N,1,5), (N,1,4), (N,4,5), (x,4,(3,5)), (x,4,(3,4)), (x,1,4), (x,1,(3,4)), (x,1,(3,5)), (x,1,(1,2)), (x,1,(1,3))

After executing the test cases, a coverage analysis is performed to measure how many test requirements were satisfied, which indicates how much of the structure of the program was actually exercised during the test session.
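To make the coverage computation concrete, here is a minimal sketch (ours, not part of any tool discussed in this paper) of how all-nodes coverage could be computed for the calcFactorial example: the required nodes come from Table I, and the executed nodes are what probes would report.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of all-nodes coverage for calcFactorial:
// coverage = covered requirements / total requirements (Table I).
public class NodeCoverageExample {
    public static void main(String[] args) {
        Set<Integer> required = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5)); // all-nodes requirements
        Set<Integer> executed = new HashSet<>(Arrays.asList(1, 3, 4, 5));    // e.g., calcFactorial(3) never reaches node 2
        long covered = executed.stream().filter(required::contains).count();
        System.out.printf("All-nodes coverage: %.0f%%%n", 100.0 * covered / required.size()); // 80%
    }
}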
B. Built-In Testing (BIT)

According to Harrold et al. [7], the lack of information regarding COTS components brings many problems to the validation, maintenance and evolution of component-based applications. BIT is one of the approaches that stand out in the literature for handling the lack of control and information in component testing. BIT was created to improve the testability of software components based on the self-testing and defect-detection concepts of electronic components. The general idea is to introduce functionality into the component to provide its users with better control and observation of its internal state [10], [11], [12]. A component developed under the BIT concepts can also contain test cases or the capability to generate test cases. Such components are commonly called testable components.

Components without testing facilities are called regular components in the remainder of this paper. Interfaces and operations of regular components are called, respectively, regular interfaces and regular operations. When a regular component becomes a testable component, it has a regular interface with regular operations as well as a testing interface with operations to support testing activities.

Based on the concepts of BIT, a European group called Component+ designed a testing architecture composed of three components [8], [9], [10], [11], [12]:
• Testable Component: the component under test, which incorporates testing facilities.
• Tester Component: implements or generates test cases to test the regular operations of the testable component.
• Handler Component: used to throw and handle exceptions; this component is especially important in fault-tolerant systems.

The testable components of the Component+ architecture have testing interfaces whose operations control a state machine to support model-based testing. A generic example of a testable component is presented in Figure 2. Component users can attach tester components to testable components. Tester components execute test cases against testable components, autonomously evaluate the results and output a test summary [18].

Fig. 2. A Component+ testable component (a regular interface IRegular plus a testing interface ITesting with operations such as setTester(Tester), invokeTester(), setState(State) and isInState())

A testable component can operate in two modes: normal mode and maintenance mode. The testing capabilities of the testable component are turned off in normal mode and turned on in maintenance mode. Atkinson and Gross [9] proposed a BIT method integrated with the KobrA approach [19] to validate contracts between components and their users at deployment time. Lima et al. [20] developed a model-driven platform to generate testers according to this method. Brenner et al. [21] developed an environment in which tasks can be set to activate tester components in many situations to perform testing activities at runtime; this environment can also react according to the test results, for example by shutting the system down and replacing components.

There are situations in which COTS components have no BIT capabilities. Barbier et al. [22] created a library called BIT/J and a set of tools that allow users to introduce testing facilities into COTS components developed in Java. BIT/J generates the testable and the tester components automatically; the testable component code must be changed manually to include states and possible transitions, and the tester component code must also be changed to include test cases. Bruel et al. [23] proposed an evolution of the BIT/J library using aspect-oriented programming.

In summary, BIT approaches focus on providing support for specification-based testing, which is a natural alternative given the black-box nature of software components. The approach presented in this paper, on the other hand, proposes facilities to support structural testing.

III. BISTFaSC: BUILT-IN STRUCTURAL TESTING FACILITIES FOR SOFTWARE COMPONENTS

Components are usually provided as black boxes and source code is seldom available to users, who therefore cannot conduct program-based testing [3]. The BISTFaSC approach was devised to improve the testability of component-based systems by adding testing facilities into software components at the provider side to support structural testing at the user side, but without revealing the source code of the component. Figure 3 shows an illustration of the approach.

Fig. 3. An illustration of the BISTFaSC approach (1 - the provider develops the regular component; 2 - instruments it; 3 - it turns into a testable component; 4 - the provider develops the tester component; 5 - the user creates a test driver that controls and observes the testable component through the tester component)

The BISTFaSC approach was designed to be used by component providers, since they should be interested in providing components with high testability to their users. Testability is an important quality indicator, since its measurement leads to the prospect of facilitating and improving a test process [24], [25]. Providing components with high testability can therefore represent a competitive advantage [3].

In BISTFaSC, component providers develop regular components and include structural testing facilities by instrumentation. Components with structural testing facilities are called testable components, in line with the BIT approaches found in the literature. The providers also develop tester components to control and observe testable components, according to the recommendations of the Component+ architecture [9], [10], [11], [12]. Both testable and tester components are packed and made available to external users. Component users purchase a component with testing capabilities and develop a test driver to execute test cases against the testable component. The test driver uses the tester component to put the testable component in testing mode (analogous to the maintenance mode in Component+) and executes the test cases to exercise the regular operations of the testable component. Then, the tester component is used again to put the testable component back in regular mode and to generate the coverage analysis.

BISTFaSC is a generic approach and may be used to create testable components with any implementation technology and platform; it only provides guidelines to help providers create testable components and to help users exploit the available structural testing facilities. These guidelines are presented in detail as follows.

A. Guidelines To Create Testable Components At The Provider Side

1) Development of the Regular Component: this stage represents the regular component engineering activities conducted to develop components. Regular components, in this paper, are components that do not have any testing facilities intended to support testing activities at the user side. Component providers employ specific programming tools and languages to develop their regular components for specific target platforms or frameworks. They can also develop test cases to perform quality assurance activities.

2) Instrumentation of the Regular Component: the purpose of this activity is to modify the regular component to give it the capability to support structural testing at the user side. This modification process is called instrumentation, and instrumented regular components are called testable components. Instrumentation is a technique in which probes are inserted throughout the component's code. Probes are instructions placed at specific locations of the code to log execution data. The implementation of the probes depends on which information must be collected during the component execution, and the information to be collected depends on the structural testing criteria supported by the testable component.
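To make the idea of a probe concrete, the sketch below is a purely illustrative, source-level instrumentation of the calcFactorial example of Figure 1 (the Probe class is our stand-in for the logging machinery; the Java implementation described later in this paper instruments bytecode instead):

// Illustrative source-level instrumentation: each probe records that the
// corresponding CFG node of calcFactorial (Figure 1) was reached.
public class InstrumentedFactorial {

    static final class Probe {
        static void log(int node) { System.out.println("node " + node); }
    }

    public int calcFactorial(int N) {
        Probe.log(1);                   // node 1: x = N and the decision x < 1
        int x = N;
        if (x < 1) {
            Probe.log(2);               // node 2: return 1
            return 1;
        } else {
            while (true) {
                Probe.log(3);           // node 3: loop condition x > 1
                if (!(x > 1)) break;
                Probe.log(4);           // node 4: N = N * (--x)
                N = N * (--x);
            }
            Probe.log(5);               // node 5: return N
            return N;
        }
    }

    public static void main(String[] args) {
        System.out.println(new InstrumentedFactorial().calcFactorial(3)); // logs the visited nodes, then prints 6
    }
}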
If the testable component provides coverage analysis only for the all-nodes and all-edges criteria (control flow), for example, the probes must log information about execution paths. If the testable component also provides coverage analysis for the all-uses criterion (data flow), the probes must additionally log data related to variable definition and usage. The data collected by the probes must be stored somewhere; a database or an XML file could be used, for example.

The instrumentation process must also collect and store the test requirements of the component according to the criteria employed to implement the probes. The test requirements could likewise be sent to a database or written to an XML file. A standard format to express the test requirements of the component and the data collected from its execution must be defined. If, for example, the test requirement for the flow deviation from Node 12 to Node 17 is expressed as Node 12 -> Node 17, the probes must register this information using the same pattern when that deviation occurs during the component execution. This is important because the coverage analysis uses the log generated by the probes to determine which test requirements were satisfied and which were not.

Probes are instructions that record data into files or databases (I/O operations), which may slow down the component. To avoid the overhead that may be introduced by probe execution, BISTFaSC defines that testable components should operate in two modes: regular mode and testing mode. The probes should be turned off when the testable component is in regular mode and turned on when it is in testing mode. In general, testable components operate in testing mode only when they are executed in the context of a test session.

The instrumentation process can be done manually by the providers or fully automated by a tool. Performing it manually, however, requires considerable effort and is error prone. BISTFaSC therefore recommends that this process be performed by a tool implemented for the target technology.

3) Development of the Tester Component: the instrumentation process collects test requirements and inserts probes into regular components, which are thereby transformed into testable components. This process, however, only prepares the testable component to generate data to support structural testing. The objective of this activity is to develop the tester component that controls and observes the testable component. The tester component interface must expose operations to define the boundaries of a test session (control) and to generate a coverage analysis report based on the information collected from a test session (observe). The boundaries of a test session can be defined by operations that start and finish a test session. The testable component starts to operate in testing mode when the operation of the tester component that starts a test session is called; consequently, the probes of the testable component are turned on at this point and start to log execution data. The testable component must return to regular mode, with the probes turned off, when the operation of the tester component that finishes the test session is invoked. The tester component also defines an operation to report the coverage analysis of a test session execution.
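One possible shape for such an interface is sketched below (the operation names are ours and merely illustrate the control/observe split required by the guidelines; the concrete interface generated by the Java implementation is shown later in Figure 7):

// Hypothetical tester component interface following the BISTFaSC guidelines:
// two operations bound a test session (control) and one reports coverage (observe).
public interface TesterComponent {

    // Puts the associated testable component in testing mode (probes on) and starts a session.
    String startTestSession();

    // Returns the testable component to regular mode (probes off).
    void finishTestSession(String sessionId);

    // Coverage report for the session, e.g. per operation, interface, class or whole component.
    String getCoverage(String sessionId, String profile);
}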
When the coverage report operation is requested, the tester component uses the log generated by the probes and the test requirements of the testable component to calculate the coverage measure. The coverage report may be presented in many ways; BISTFaSC suggests four coverage analysis profiles:
• Operations: presents the coverage for every operation of each class of the component (considering an object-oriented implementation).
• Interface: presents the coverage for the operations of the component's interface.
• Classes: presents the coverage for each class of the component.
• Component: presents the coverage for the whole component.

The tester component may also expose other operations to provide users with more testing facilities, but it must at least expose operations to define the boundaries of a test session (control) and to perform coverage analysis (observation). It is important to notice that the operations that support testing activities are not inserted into the testable component; they are exposed by the tester component. The interface of the regular component remains the same, and the component can still be used only through that interface.

4) Packing the Testable and the Tester Component: the goal of this activity is to pack all resources related to the testable component. The component provider must pack the testable component, its libraries and resources, the tester component, and the resource (a database or file, for example) used to record the test requirements of the testable component. All these assets must be packed together because the testable and the tester components will be executed at the user side; the tester component, for example, needs to access the test requirements to generate the coverage analysis report.

B. Guidelines To Use Testable Components At The User Side

Component users receive the package with the testable and the tester component along with all required resources and libraries. There is no difference between using a regular component and using a testable component in regular mode: a testable component in regular mode has no testing facility activated and the tester component is not required.

The component user cannot use the testing facilities provided by testable components directly; the user can only invoke the regular operations of the testable component. The tester component must be called to put the testable component in testing mode before conducting structural testing activities. Figure 4 shows an illustration of this process.

Fig. 4. An illustration of a test session conducted by a testable component (1 - startTestSession(); 1.1 - changeMode("Testing"); 2 - invokeOperations(...); 2.1 - Log(...); 3 - finishTestSession(); 3.1 - changeMode("Regular"); 4 - getCoverage(profile))

The user calls an operation of the tester component to start a test session. The tester component accesses the testable component and changes it to testing mode. From this point on, every execution of the testable component is logged by its probes. Then, the user executes a test set against the testable component, calling its regular operations. Next, the user invokes the tester component to finish the test session, and the tester component returns the testable component to regular mode. Finally, the user invokes the operation of the tester component that produces the coverage analysis report, which is generated by the tester component.

IV. A JAVA IMPLEMENTATION OF BISTFaSC

BISTFaSC is a generic approach and may be applied to any component implementation, technology or target platform. We validate the main concepts of the approach by means of an implementation of testable and tester components written in Java. We show, in this section, how regular Java components are instrumented and how they are controlled and observed by the tester components. The use of the testable components generated by this particular implementation is presented in the next section.

For the sake of simplicity, regular components are implemented in Java without considering any specific details of component platforms (such as EJB). Components are packed into JAR files along with the required resources and libraries. The instrumentation process of this particular implementation inserts probes into regular components to collect control-flow (all-nodes and all-edges) and data-flow (all-uses) data. Performing this process manually requires much effort and is error prone, so we developed a tool called BITGen to transform regular components into testable components. Figure 5 presents a simplified architecture of the BITGen tool.

Fig. 5. The simplified architecture of the BITGen tool (BITGen is built on top of JaBUTi's instrumenter components)

BITGen receives a regular component packed within a JAR file and generates another JAR file containing its testable version. If BITGen receives a JAR file called Comp.jar, for example, it generates a package named Comp_BIT.jar. Figure 6 presents the model of the testable component package generated by BITGen. The package contains the tester (BITTester) and the testable component, an XML file for the test requirements, a class called CoverageMode and the CoverageAnalysis components of the JaBUTi tool. The CoverageAnalysis components of JaBUTi are used by BITTester to calculate the coverage, and CoverageMode is used to format the coverage report according to the profiles suggested by BISTFaSC (see Section III-A3).

Fig. 6. Model of the testable component package generated by BITGen (the BITTester and Testable Component interfaces, JaBUTi's CoverageAnalysis components, the CoverageMode class and the TestRequirements XML file)

BITGen uses the instrumenter components of the JaBUTi (Java Bytecode Understanding and Testing) tool [26] to instrument the regular component. JaBUTi is a structural testing tool that implements intra-method control-flow and data-flow testing criteria for Java programs. The instrumentation is based on Java bytecode and is supported by BCEL (Byte Code Engineering Library). During instrumentation, the test requirements for the control-flow and data-flow criteria are written to an XML file.

BISTFaSC is intended to be used by component providers to produce testable components for their clients. In this implementation, however, component users can also transform Java COTS components into testable components, since BITGen instrumentation is based on Java bytecode and the source code is not required.

The tester component associated with the testable component is automatically generated by BITGen. Figure 7 shows the interface of the tester component. The tester component is called BITTester and exposes operations to control (startTesting and stopTesting) and to observe (getCoverage) the testable component.
BITTester()
void startTesting(sessionID)
String startTesting()
void stopTesting()
String getCoverage(sessionID, covMode)
String getCoverage(sessionID, covMode, className)
Fig. 7. Interface of the tester component generated by BITGen.

The operation String startTesting() initiates a test session and puts the testable component in testing mode, i.e., it turns on the probes of the testable component. Control-flow and data-flow data are collected by the probes and recorded into a trace file while the testable component is executed in this mode. The return value of this operation is an identifier automatically generated for the test session just initiated. This identifier is also used to name the trace file generated during the testable component execution. If the test session is identified by "1234", for example, the trace file generated is trace_1234.trc.

The operation void startTesting(sessionID) has the same effect as the previous one, but it receives a session identifier instead of generating it. This operation is useful mainly when the tester wants to perform one coverage analysis for several test sessions recognized by the same identifier. In this case, all information regarding the testable component execution is stored in the same trace file.

The operation void stopTesting() finishes a test session by turning off the probes. No information about the testable component execution is recorded from this moment on.

The operation String getCoverage(sessionID, covMode) produces a coverage analysis of the test session identified by the parameter sessionID. The report is presented according to the parameter covMode (see Section III-A3). In this particular operation, there is no difference between the Operations and the Interface coverage modes: the coverage is presented for all operations of the component for both profiles.

The operation String getCoverage(sessionID, covMode, className) also generates a coverage analysis. The difference from the previous operation is that a class name may be specified as an input parameter. A coverage analysis only for the specified class is generated when the Classes profile is used, and only for the operations of the specified class when the Interface profile is employed.

When one of the getCoverage operations is called, the tester component (BITTester) finds the trace file associated with the test session identifier and sends it to the CoverageAnalysis components of JaBUTi. These components use the XML file that contains the test requirements of the testable component to calculate the coverage according to the implemented criteria. Finally, BITTester uses the class CoverageMode to format the coverage report according to the requested profile (Component, Classes, Interface or Operations).

V. EXPLORATORY STUDY

An exploratory study was conducted to investigate the feasibility of BISTFaSC and to understand the effects of the testable component on a test session executed at the user side. The investigation was performed considering the Java implementation of the approach, and a component called XmlWriter was used. XmlWriter is an open source component that is publicly available at the component provider website (https://code.google.com/p/osjava/). This component is used to output XML code, and the user may layer other functionalities on top of the core writing, such as on-the-fly schema checking, date/number formatting, specific empty-element handling and pretty-printing.

A. Instrumentation

This study was performed from the point of view of a component user, but first we had to transform XmlWriter into a testable component using the BITGen tool. Figure 8 shows the graphical user interface of BITGen. The tool requires the name of the component, the JAR file with the component to be instrumented, and a local path to write the testable package and to store the data collected by the probes (the trace file). After pushing the Generate button, BITGen instrumented XmlWriter and generated a package called XmlWriter_BIT.jar. This package contains all the classes, components and resources presented in Figure 6.

Fig. 8. User interface of BITGen.

Once the regular component is fully developed and packed along with its libraries and resources, the effort to transform it into a testable component is really low. Considering this implementation, the component provider only has to provide the information required by BITGen, and the testable component is generated automatically.

B. Test Session

An Eclipse project was created to evaluate the testable version of XmlWriter. The package XmlWriter_BIT.jar was included in the library references of the project and a test scenario was created. The providers of XmlWriter also published a JUnit test set to test its operations. Instead of creating new test cases for XmlWriter, we used this test set, which is presented in Listing 1. A set up and a tear down method were included in this investigation to control and observe the testable component following the sequence diagram presented in Figure 4. In the set up phase the tester component is invoked to start a test session, and in the tear down phase it is called to stop the test session and to get a coverage analysis.

The setUp method runs only once, before the test cases execution, because of the annotation @BeforeClass. The tester component BITTester is instantiated in Line 8 and a test session is started in Line 9. The identifier of the test session is recorded in the variable testSessionID. The tearDown method runs only once, after the test cases execution, because of the annotation @AfterClass. The test session is finished in Line 14 and the coverage report is requested in Line 16. In this case, the coverage was requested for the whole component profile (CoverageMode.COMPONENT). The coverage is returned as a String, and it was written to the output console of the application (Line 17). Lines 20 to 55 show common JUnit test cases. These test cases were created by the component provider and reused in this investigation.

Listing 1. Test set of XmlWriter.
01 public class XmlWriterTest {
02
03   private static BITTester bitTester;
04   private static String testSessionID;
05
06   @BeforeClass
07   public static void setUp() {
08     bitTester = new BITTester();
09     testSessionID = bitTester.startTesting();
10   }
11
12   @AfterClass
13   public static void tearDown() {
14     bitTester.stopTesting();
15     String coverage;
16     coverage = bitTester.getCoverage(testSessionID, CoverageMode.COMPONENT);
17     System.out.println(coverage);
18   }
19
20   @Test
21   public void testXmlWriter01() throws IOException {
22     StringWriter sw = new StringWriter();
23     XmlWriter xw = new SimpleXmlWriter(sw);
24     xw.writeEntity("unit");
25     xw.endEntity();
26     xw.close();
27     assertEquals(sw.toString(), "<unit/>");
28   }
29
30   @Test
31   public void testXmlWriter02() throws IOException {
32     StringWriter sw = new StringWriter();
33     XmlWriter xw = new SimpleXmlWriter(sw);
34     sw = new StringWriter();
35     xw = new SimpleXmlWriter(sw);
36     xw.writeXmlVersion("1.0", "UTF-8");
37     xw.writeComment("Unit test");
38     xw.writeEntity("unit");
39     xw.writeEntity("test").writeAttribute("order", "1").writeAttribute("language", "english").endEntity();
40     xw.writeEntity("again").writeAttribute("order", "2").writeAttribute("language", "english").writeEntity("andAgain").endEntity().endEntity();
41     xw.endEntity();
42     xw.close();
43
44     assertEquals(sw.toString(), getTest2Output());
45   }
46
47   private String getTest2Output() {
48     return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
49       + "<!--Unit test-->"
50       + "<unit>"
51       + "<test order=\"1\" language=\"english\"/>"
52       + "<again order=\"2\" language=\"english\"><andAgain/></again>"
53       + "</unit>";
54   }
55 }

Table II presents the coverage analysis obtained from the execution of the test set. The first column shows the testing criteria considered during the coverage analysis. The second column displays the number of test requirements for each criterion considering the whole component, i.e., the sum of the test requirements generated for all operations of all component classes. The third column presents the number of test requirements covered by the test session execution, again considering the whole component. The fourth column shows the coverage percentage, which is calculated as CovTReq over TReq.

TABLE II. COVERAGE ANALYSIS OF XMLWRITER
Criterion   TReq   CovTReq   Coverage
All-nodes   100    33        33%
All-edges   84     23        27%
All-uses    238    56        23%

The coverage achieved is low for all criteria. Listing 2 shows an excerpt of the code used to get the coverage report at different granularity levels (presentation profiles). In these cases, the tester can investigate which classes and/or operations are not being exercised satisfactorily.

Listing 2. Script used to get different presentations of the coverage.
01 coverage = bitTester.getCoverage(testSessionID, CoverageMode.ALL_CLASSES);
02 System.out.println(coverage);
03
04 coverage = bitTester.getCoverage(testSessionID, CoverageMode.ALL_OPERATIONS);
05 System.out.println(coverage);
06
07 coverage = bitTester.getCoverage(testSessionID, CoverageMode.INTERFACE_OPERATIONS, "SimpleXmlWriter");
08 System.out.println(coverage);

Table III presents the coverage obtained for each class of the component (Lines 1 and 2 of Listing 2). The table shows the coverage percentage and the number of covered test requirements over the number of test requirements (CovTReq/TReq) for each class and criterion. This report shows that the coverage reached for the class XmlUtils is practically 0% for all criteria. XmlUtils is probably not exercised because SimpleXmlWriter uses only one of its operations. That is why the coverage of the whole component is so low, even though the coverage for SimpleXmlWriter and AbstractXmlWriter is reasonable.

TABLE III. COVERAGE ANALYSIS OF XMLWRITER CLASSES
Classes             All-nodes     All-edges     All-uses
SimpleXmlWriter     72% (29/40)   70% (22/31)   66% (56/84)
AbstractXmlWriter   50% (3/6)     100% (1/1)    -
XmlUtils            1% (1/54)     0% (0/52)     0% (0/154)

Table IV presents the coverage obtained for each operation of SimpleXmlWriter (Lines 7 and 8 of Listing 2). This table shows only the coverage percentage for each operation and criterion. The sign "-" indicates the absence of test requirements for that criterion. This usually happens when the method has only one node; in that case there are no test requirements for edges and uses. The tester can use this report to see which operations were not exercised and which operations need to be tested more extensively.

TABLE IV. COVERAGE ANALYSIS OF SIMPLEXMLWRITER OPERATIONS
Operations (in order): close, closeOpeningTag, endEntity, getDefaultNamespace, getWriter, openEntity, setDefaultNamespace, writeAttribute, writeAttributes, writeCData, writeChunk, writeComment, writeEntity, writeText, writeXmlVersion
All-nodes (per operation, same order): 66%, 100%, 87%, 0%, 0%, 100%, 0%, 100%, 100%, 0%, 100%, 100%, 66%, 0%, 80%
All-uses (operations that have use requirements): 50%, 100%, 77%, 0%, 100%, 100%, 50%, 50%
All-edges (operations that have edge requirements): 30%, 100%, 76%, 0%, 100%, 100%, 44%, 55%

The component user did not spend much effort to use the testing facilities of the testable component to conduct structural testing activities. Considering this particular implementation and the JUnit framework, the user had to add only 12 extra lines (Lines 06 to 10 and Lines 12 to 18 of Listing 1) to the test set code.

C. Performance Overhead Analysis

We performed an analysis to measure the performance overhead brought by the testable component probes. We measured the time it took to execute the test set presented in Listing 1 100 times. First, we tested the regular version of XmlWriter. Next, we executed the test set against XmlWriter_BIT in regular mode, i.e., the startTesting operation was not invoked and the probes therefore remained deactivated. Finally, we tested XmlWriter_BIT in testing mode, i.e., the startTesting operation was called beforehand and the probes were activated. Table V shows the results of this analysis.

TABLE V. ANALYSIS OF THE PERFORMANCE OVERHEAD
Component                    Time     Overhead
XmlWriter                    44 ms    0%
XmlWriter_BIT (probes off)   47 ms    7%
XmlWriter_BIT (probes on)    460 ms   1045%

The overhead brought by the testable component is minimal when the probes are off. When the probes are on, however, the component execution is about 10 times slower. Even with this great overhead brought by the probes when they are on, we believe it is not significant in general, because the probes will be on only when the component is under test. The probes are turned off when the component is integrated into an operational environment, and in this case the overhead is lower and has a less significant impact on the overall performance. However, it may be critical for many real-time systems.

D. Size Overhead Analysis

The testable components generated by BITGen have their size enlarged by extra code and libraries. The component classes are extended by the probes inserted during instrumentation. Table VI shows a comparison between the size (in bytes) of the XmlWriter classes before and after the instrumentation.

TABLE VI. ANALYSIS OF THE SIZE OVERHEAD
Classes             Size before   Size after   Overhead
SimpleXmlWriter     4531          5515         22%
AbstractXmlWriter   1417          1795         26%
XmlUtils            3739          5027         34%
All classes         9687          12337        27%

SimpleXmlWriter presents the smallest overhead of the component because it has 15 short operations, which do not require too many probes to be instrumented. AbstractXmlWriter has only 4 small operations, which are usually calls to abstract or interface operations; the size overhead presented by this class is about the average considering all classes (27%). XmlUtils presents the greatest size overhead of the component. XmlUtils is smaller and has fewer operations than SimpleXmlWriter, but its operations are larger and have more control-flow deviations and variable usage. This makes a difference regarding the test requirements generated for the structural testing criteria (see Table III): XmlUtils generates more test requirements than SimpleXmlWriter for all criteria, and the more test requirements are generated for a class, the more probes are required to trace its execution.

The size overhead is not so important when components are used in enterprise environments, where space and memory are usually widely available. However, it may be critical when the component is embedded into devices with space and memory restrictions. Regarding size overhead, the major problem of the testable components generated by BITGen is the size of the libraries and components required to perform the coverage analysis. Coverage analysis requires components of the JaBUTi tool and libraries to manipulate XML data and Java bytecode. These libraries and components add at least 1 MB to the testable component size. Again, this is not a significant overhead in enterprise environments, but it is relevant for restricted devices. The overhead brought by extra components and libraries, however, is not related to the generic BISTFaSC approach: the restriction is imposed by its Java implementation, which could be improved by reducing the number of library dependencies. An alternative to overcome this situation is to use two versions of the component: a testable version to conduct testing activities and a regular version to embed into devices with restrictions.

VI. RELATED WORK

Several approaches have been proposed in the literature in an attempt to improve component testability and to support testing activities on the user side. The already mentioned BIT approaches (see Section II-B) introduce testing facilities into software components to, for example, control state machines, execute/generate test cases, and validate contracts and invocation sequences [9], [10], [11], [22], [23], [12], [13], [20]. The BISTFaSC approach is similar to most of these approaches since it also introduces associated testable and tester components. Testable components have testing facilities that can be turned on and off. The difference is that BISTFaSC introduces testing facilities to support structural testing, while the other approaches usually support specification-based testing. Testable components are intended to be generated by the component providers in classical BIT approaches and also in BISTFaSC.
When COTS components are not equipped with testing facilities by their providers, however, component users can use the Java implementation of the approach to instrument Java COTS themselves. This process is similar to the approach supported by the BIT/J library [22].

Teixeira et al. [27] proposed a tool based on JaBUTi [26] to support the structural testing of Java components. The component provider can instrument the component and create test cases for it. The component user receives the instrumented component packed with test cases, test case descriptions and coverage information. The user can then use the tool to create test cases using the JUnit framework and get a coverage analysis of the test set execution. This coverage can be compared to the coverage reached by the provider at development time, and the test cases packed with the component can be reused. There are many differences between BISTFaSC and the tool proposed by Teixeira et al. [27], although both are meant to support structural testing of software components. BISTFaSC follows the concepts of BIT and defines a tester component for the testable component. Moreover, the testing facilities that support structural testing activities are embedded into the component code and libraries, therefore no supporting tool is required.

Eler et al. and Bartolini et al. proposed an approach to produce testable services with structural testing capabilities [28], [29]. Their approach is similar to BISTFaSC because they propose to transform regular services into testable services by instrumentation. In their case, however, the testable service has to implement the operations that define the boundaries of a test session and that return the coverage information, and the coverage information is generated by a third-party service instead of being calculated by the testable service itself. The testable components of BISTFaSC do not have their operations augmented, and the testing facilities are implemented by the associated tester component. Moreover, the coverage information is calculated internally, so no third-party component is required.

VII. CONCLUSION

This paper presented a BIT solution to introduce facilities into software components at the provider side to support structural testing activities at the user side. The approach is generic and is called BISTFaSC. A Java implementation of the approach was presented and used to perform an exploratory study by generating Java testable components from third-party components.

The exploratory study showed that the component provider does not need much effort to generate testable components, and the component user only has to add a little extra code to their test class to take advantage of the testing facilities provided by testable components. The performance overhead brought by the probes added during instrumentation is not significant when the probes are off (regular mode), but it may be significant for real-time applications. The size overhead is also not relevant in enterprise environments, but it may be significant if the target environment is a device with space and memory restrictions.

We believe this approach can bring benefits to CBSE, since it allows users to apply both specification-based and implementation-based testing techniques. Combining these two testing techniques may provide higher confidence to component providers and users. Moreover, there are several software engineering approaches in which components are used as the building blocks of applications, such as object-oriented frameworks, service-oriented computing and software product lines. Testable components can contribute to the testing activities conducted for all of these types of applications.

The coverage information alone cannot help testers to improve their test set to increase the coverage when it is low. It only gives a clue of how much of the component was exercised during a test session, which is valuable information in itself. As future work, we intend to propose metadata to help testers understand which test cases should be created to improve the coverage achieved, but without revealing the source code. The idea is to use test metadata as suggested by Harrold et al. [7] and used by Eler et al. [28] for testable services. We also intend to explore how structural testing facilities could be used to perform component monitoring and regression test case selection and reduction. Moreover, we would like to perform a more rigorous evaluation of the approach with bigger components. We also want to evaluate the usability of the approach at the user and at the provider side.

ACKNOWLEDGMENT

The authors would like to thank the Brazilian funding agency CNPq for its support.

REFERENCES

[1] C. Szyperski, Component Software: Beyond Object-Oriented Programming, ser. Component Software. Addison-Wesley, 2002.
[2] J. Cheesman and J. Daniels, UML Components: A Simple Process for Specifying Component-Based Software. Addison-Wesley, 2000.
[3] S. Beydeda, "Research in testing COTS components - built-in testing approaches," in Proceedings of the ACS/IEEE 2005 International Conference on Computer Systems and Applications, ser. AICCSA '05. Washington, DC, USA: IEEE Computer Society, 2005. [Online]. Available: http://dl.acm.org/citation.cfm?id=1249246.1249567
[4] E. J. Weyuker, "Testing component-based software: A cautionary tale," IEEE Software, vol. 15, no. 5, pp. 54–59, 1998.
[5] M. Bozkurt, M. Harman, and Y. Hassoun, "Testing and verification in service-oriented architecture: a survey," Software Testing, Verification and Reliability, 2012. [Online]. Available: http://dx.doi.org/10.1002/stvr.1470
[6] IEEE, "IEEE Standard Glossary of Software Engineering Terminology," Tech. Rep., 1990. [Online]. Available: http://dx.doi.org/10.1109/IEEESTD.1990.101064
[7] M. J. Harrold, A. Orso, D. Rosenblum, G. Rothermel, and M. L. Soffa, "Using component metadata to support the regression testing of component-based software," Tech. Rep. GIT-CC-00-38, 2000.
[8] Y. Wang, G. King, and H. Wickburg, "A method for built-in tests in component-based software maintenance," in Proceedings of the Third European Conference on Software Maintenance and Reengineering. Washington, DC, USA: IEEE Computer Society, 1999, p. 186.
[9] C. Atkinson and H.-G. Gross, "Built-in contract testing in model-driven, component-based development," in ICSR-7 Workshop on Component-Based Development Processes, 2002.
[10] Y. Wang and G. King, "A European COTS architecture with built-in tests," in Proceedings of the 8th International Conference on Object-Oriented Information Systems. London, UK: Springer-Verlag, 2002, pp. 336–347.
[11] J. Hornstein and H. Edler, "Test reuse in CBSE using built-in tests," 2002.
[12] H.-G. Gross, Component-Based Software Testing with UML. Springer, 2005.
[13] L. C. Briand, Y. Labiche, and M. M.
Sówka, "Automated, contract-based user testing of commercial-off-the-shelf components," in Proceedings of the 28th International Conference on Software Engineering. New York, NY, USA: ACM, 2006, pp. 92–101.
[14] G. J. Myers, C. Sandler, T. Badgett, and T. M. Thomas, The Art of Software Testing. Hoboken, New Jersey: John Wiley & Sons, Inc., 2004.
[15] B. Beizer, Software Testing Techniques, 2nd ed. New York, NY, USA: Van Nostrand Reinhold Co., 1990.
[16] H. Zhu, P. A. V. Hall, and J. H. R. May, "Software unit test coverage and adequacy," ACM Computing Surveys (CSUR), vol. 29, no. 4, pp. 366–427, 1997.
[17] S. Rapps and E. J. Weyuker, "Selecting software test data using data flow information," IEEE Transactions on Software Engineering, vol. 11, no. 4, pp. 367–375, Apr. 1985.
[18] S. Beydeda and V. Gruhn, "State of the art in testing components," in International Conference on Quality Software. IEEE Computer Society Press, 2003, pp. 146–153.
[19] C. Atkinson, J. Bayer, and D. Muthig, "Component-based product line development: the KobrA approach," in Proceedings of the First Conference on Software Product Lines: Experience and Research Directions. Norwell, MA, USA: Kluwer Academic Publishers, 2000, pp. 289–309. [Online]. Available: http://dl.acm.org/citation.cfm?id=355461.357556
[20] H. S. Lima, F. Ramalho, P. D. L. Machado, and E. L. Galdino, "Automatic generation of platform independent built-in contract testers," in Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software, 2007.
[21] D. Brenner, C. Atkinson, B. Paech, R. Malaka, M. Merdes, and D. Suliman, "Reducing verification effort in component-based software engineering through built-in testing," in Proceedings of the 10th IEEE International Enterprise Distributed Object Computing Conference. Washington, DC, USA: IEEE Computer Society, 2006, pp. 175–184.
[22] F. Barbier, N. Belloir, and J. M. Bruel, "Incorporation of test functionality into software components," in COTS-Based Software Systems. Springer, 2003.
[23] J.-M. Bruel, J. Araújo, A. Moreira, and A. Royer, "Using aspects to develop built-in tests for components," in The 4th AOSD Modeling With UML Workshop, 2003.
[24] W. T. Tsai, J. Gao, X. Wei, and Y. Chen, "Testability of software in service-oriented architecture," in Proceedings of the 30th Annual International Computer Software and Applications Conference. Washington, DC, USA: IEEE Computer Society, 2006, pp. 163–170.
[25] L. O'Brien, P. Merson, and L. Bass, "Quality attributes for service-oriented architectures," in Proceedings of the International Workshop on Systems Development in SOA Environments. Washington, DC, USA: IEEE Computer Society, 2007, p. 3.
[26] A. M. R. Vincenzi, M. E. Delamaro, J. C. Maldonado, and W. E. Wong, "Establishing structural testing criteria for Java bytecode," Software Practice & Experience, vol. 36, no. 14, pp. 1513–1541, 2006.
[27] V. S. Teixeira, M. E. Delamaro, and A. M. R. Vincenzi, "FATESC - uma ferramenta de apoio ao teste estrutural de componentes," in Sessão de Ferramentas - XXI Simpósio Brasileiro de Engenharia de Software. New York, NY, USA: ACM Press, 2007, pp. 7–12.
[28] M. Eler, A. Bertolino, and P. Masiero, "More testable service compositions by test metadata," in 6th IEEE International Symposium on Service Oriented System Engineering. Washington, DC, USA: IEEE Computer Society, 2011, pp. 204–213.
[29] C. Bartolini, A. Bertolino, S. Elbaum, and E.
Marchetti, "Bringing white-box testing to service oriented architectures through a service oriented approach," Journal of Systems and Software, vol. 84, pp. 655–668, April 2011.

Using Thesaurus-Based Tag Clouds to Improve Test-Driven Code Search

Otavio Augusto Lazzarini Lemos∗, Adriano Carvalho∗, Gustavo Konishi∗, Joel Ossher†, Sushil Bajracharya‡, Cristina Lopes†
∗ Science and Technology Department – Federal University of Sao Paulo at S. J. dos Campos – Brazil, {otavio.lemos, konishi, adriano.carvalho}@unifesp.br
† Donald Bren School of Information and Computer Sciences – University of California at Irvine, {jossher, lopes}@ics.uci.edu
‡ Black Duck Software, Inc., [email protected]

Abstract—Test-driven code search (TDCS) is an approach to code search and reuse that uses test cases as inputs to form the search query. Together with the test cases that provide more semantics to the search task, keywords taken from class and method names are still required. Therefore, the effectiveness of the approach also relies on how good these keywords are, i.e., how frequently they are chosen by developers to name the desired functions. To help users choose adequate words in their query test cases, visual aids can be used. In this paper we propose thesaurus-based tag clouds to show developers terms that are more frequently used in the code repository to improve their search. Terms are generated by looking up words similar to the initial keywords in a thesaurus. Tag clouds are then formed based on the frequency in which these terms appear in the code base. Our approach was implemented with an English thesaurus as an extension to CodeGenie, a Java- and Eclipse-based TDCS tool. Our evaluation shows that the approach can help improve the number of returned results. We also noticed the visual aid can be especially useful for non-native speakers of the language in which the code repository is written. These users are frequently unaware of the most common terms used to name specific functionality in the code, in the given language.

I. INTRODUCTION

The increasing availability of open source code on the Internet has made code reuse possible through searches made upon open source software repositories [1]. Although this type of reuse can be relatively effective, it generally relies mostly on keywords, regular expressions, and other more syntactic information about the function to be found. Test-driven code search (TDCS) was proposed as a form of code search and reuse that makes use of more semantic information available in test cases [4] (similar test-driven code search approaches were also proposed by other researchers [2, 3]). Although exploratory studies have shown that TDCS can be effective in the reuse of auxiliary functions, its success also relies on keywords extracted from test cases (e.g., the searched method name). If the user selects these names poorly, few results will be returned. This issue is related to the vocabulary mismatch problem (VMP) as discussed by Bajracharya et al. [1]. The VMP states that the likelihood of two people choosing the same term for a familiar concept is only between 10 and 15% [5].

A way to circumvent this problem is to present different options to the user based on the initially chosen term. Similar keywords can be automatically investigated and presented visually to the user according to their frequency in the code base. Tag clouds are visual aids adequate for this context, because the size and color of the terms can be presented according to their relevance in the repository.

In this paper, we propose the use of thesaurus-based tag clouds to improve TDCS. A thesaurus of the same language in which the code in a given repository is written is used to explore similar terms for a given initial keyword taken from the method name. Each of these terms is searched in the repository and the tag cloud is formed according to their frequency in the code base. The terms are also weighted and colored according to where they appear in the fully qualified names of matching methods: the closer the terms are to the method name, the larger their weight. We implemented the proposed approach as an extension to CodeGenie [4], a Java- and Eclipse-based TDCS tool. Since CodeGenie is based on Sourcerer [6], an infrastructure with a code repository mostly in English, a thesaurus for that language was used.

To have an idea of the effectiveness of our approach, an exploratory study with example searches was conducted. Our initial investigation shows that the tag clouds can improve the result set when the initially selected method name is replaced by the most frequent related term. We also noticed that our approach can be especially useful for non-native speakers of the language in which the repository is based. These users may initially select terms that are not commonly used to name the desired functions, thus reducing the possibility of retrieving good results.

The remainder of this paper is structured as follows. Section II presents background information about TDCS, CodeGenie, and tag clouds, and Section III presents our thesaurus-based tag clouds approach to TDCS. Section IV presents details about our implementation using CodeGenie, and Section V presents an exploratory evaluation of the approach. Section VI presents related work and, finally, Section VII concludes the paper.

II. BACKGROUND

TDCS makes use of test cases to describe a desired feature to be searched, the same way test cases are used in Test-Driven Development [7]. Figure 1 shows the basic TDCS process. To describe a missing feature in a project, test cases are designed in the Integrated Development Environment (IDE). The search facility is triggered and a query is formed based on information available in the test cases. In the IDE, the developer can explore results by weaving, testing, and unweaving them. Also involved in this process are a program slicing service and a repository access service, which provides access to self-contained code pieces. Whenever the developer feels satisfied with a particular code result, it can be left woven into the project. Unweaving of a code result at any time can also be performed [4].

Figure 1. TDCS process [4].

TDCS was implemented as an Eclipse (http://www.eclipse.org/, accessed 06-27-2012) plugin named CodeGenie. The Code Services side is provided by Sourcerer, a source code infrastructure that provides all the support needed to perform TDCS. CodeGenie formulates queries that contain three parts: (1) keywords present in the fully qualified name of the desired method; (2) the return type of the method; and (3) the parameter types of the method. For example, given a test case with the assertion assertEquals("trevni", Util.invert("invert")), CodeGenie formulates the following query:

fqn contents:(util invert) m ret type contents:(String) m sig args sname:String

The query above means: "look for a method that contains the terms 'util' and 'invert' somewhere in the fully qualified name, returns a String, and receives a String as a parameter".

1) Example: To show how TDCS is used in practice, we show an example of a search conducted using CodeGenie [4]. Consider the development of a document editing system. An important element of such systems are counters used to number sections, pages, etc. An Arabic-to-Roman function could be implemented to present counters as Roman numerals. Figure 2 presents sample test cases in JUnit for a static method implementation of the function.

public class RomanTest {
  @Test
  public void testRoman1() {
    assertEquals("I", Util.roman(1));
  }

  @Test
  public void testRoman2() {
    assertEquals("II", Util.roman(2));
  }

  ...

  @Test
  public void testRoman6() {
    assertEquals("M", Util.roman(1000));
  }
}

Figure 2. Partial JUnit test class for an Arabic to Roman function.

After implementing the JUnit test cases, the user triggers the CodeGenie search facility by right-clicking on the test class and selecting the CodeGenie Search menu option shown in Figure 3.

Figure 3. CodeGenie search being triggered [4].

CodeGenie sends the query to Sourcerer which, in turn, returns code results. The keywords are initially formed by the method and class names; in the example, 'util' and 'roman' are the initial keywords. By default, all the information in the test cases is used to generate the query, i.e., class name, method name, and method signature. Figure 4 shows CodeGenie's Search View with the results related to this example. The developer can examine, weave, and test a result by right-clicking on it and selecting 'Integrate Slice' and 'Test Slice'. Integrated results can be detached by selecting the 'Detach Slice' option.

Figure 4. CodeGenie Search View [4].

When a result is integrated, it appears as '[currently integrated]' in the Search View, and it can be tested using the test cases designed for the search. When a result is successful against the test cases, a green bullet at its right side is used to mark it, and the test results are appended to the result label. Red bullets and yellow bullets are used for failing candidates and yet-to-be-tested candidates. As results are tested, they are rearranged in the Search View, so that green ones appear first, yellow ones appear second, and red ones appear at the bottom. There is also a second-level ordering for green results according to the execution time of the test cases. The user can also preview the code of a result by using the CodeGenie Snippet Viewer [4].

Note that the query formed by CodeGenie also contains keywords that are important to its effectiveness. If the user selects them poorly, few results will be returned. To help obtain good terms for keywords, visual aids such as tag clouds can be applied.

A. Tag Clouds

Tag clouds are visual presentations of a set of terms in which text attributes such as size, weight, or color are used to represent features of the associated terms [8]. Tag clouds can also be used to navigate in a set of terms, to emphasize information, and to show the results of a search task.

III. THESAURUS-BASED TAG CLOUDS FOR TDCS

Although initial evidence has shown that TDCS can be useful in the reuse of open source code [4], some problems still affect its effectiveness. As discussed by Bajracharya et al. [1], in code search and reuse the VMP mentioned earlier manifests itself as the gap between the situation and solution models [9]. Developers implement code using words that describe the solution model, while users seeking to reuse code might use words that come from their situation models. In general-purpose search engines such as Google, such mismatch is reduced by the abundance of linguistically rich documents. Information retrieval models work well on natural languages but tend to be ineffective with source code, which is linguistically sparse. Approaches such as TDCS take advantage of the richness of source code structure, but the VMP is still relevant because users also rely on the choice of good keywords.

A way to reduce this problem is to apply visual aids that can present terms similar to the ones initially chosen by users before the search task takes place. As commented earlier, tag clouds are especially suited for this context since they can present similar words with visual richness based on the frequency in which these words appear in the code repository. There are several ways to explore analogous keywords when constructing the tag clouds, such as using similarity algorithms like the one proposed by Bajracharya et al. [1]. In this paper we use a simpler approach that makes use of thesauri.

The idea behind the thesaurus-based tag cloud is to form the cloud from synonyms of a given initial term, assigning weights to the returned results according to their relevance in the code base. In TDCS the tag cloud is used to search all synonyms of a given keyword taken from the method name. For example, when searching for a method initially named sum that adds two numbers, the cloud would show the words addition, aggregation, and calculation, according to the frequency in which each term appears in the repository. In this way, developers can visualize which terms are more common than others, and change their initial choices.

Figure 5 shows the thesaurus-based tag cloud creation process, as implemented in CodeGenie (see Section IV). First, according to the TDCS approach, the user creates a test set for the desired functionality. Then, the name of the method to which the test cases are targeted is extracted to form the initial term. The synonyms of this term are looked up in the thesaurus base. Each of the returned synonyms is searched in the code repository – in our case, in Sourcerer – and information about their frequency is given to the tag cloud generator. The tag cloud is then generated and shown to the user, who has the option of changing the method name used in the test cases with a search-and-replace task. Finally, the user can rerun the search using the more adequate term.

Figure 5. Thesaurus-based tag cloud creation process.

Initially, one might think that using synonyms to search for source code would introduce too much noise. However, since these terms are searched in a source code repository – more specifically, in fully qualified names of code entities – the tag cloud will only present terms related to code. In this way, the thesaurus-based tag clouds allow "taking a peek" at the repository before running the actual search. Such a quick look makes the search more prone to return better results.

IV. IMPLEMENTATION

To create thesaurus-based tag clouds, we need a synonyms database. For that reason, we have set up a relational MySQL 5.1 (http://www.mysql.com/, accessed 06-27-2012) database with a set of words and related synonyms. English was used because the majority of the code available on Sourcerer is written in that language. However, it is important to note that other languages can be used, and switching between them is straightforward. Also note that other types of dictionaries can be used, such as domain-specific thesauri.

For our implementation, a synonyms search application was developed separately from CodeGenie. This application has access to the thesaurus database and executes simple SQL queries that, through the use of servlets, make the synonyms search available to CodeGenie. In this way, CodeGenie can search for synonyms of any term by simply accessing the servlet (a newer version of CodeGenie with tag clouds makes use of WordNet [10] instead of the older thesaurus).

Depending on the domain the developer is working on, other domain-specific terms can come up and be considered synonyms of a given term. Therefore, we also implemented the functionality of inserting and removing synonyms in the initial thesaurus database, however only allowing the removal of words not originally in the thesaurus; that is, for security reasons, original words in the thesaurus cannot be removed. With a right-click on any region of Eclipse's Editor, the user can access the Synonyms Editor and insert or remove terms. Figure 6 shows the Synonyms Editor interface.

Figure 6. Synonyms Editor.

Once the synonyms of a term are gathered, to create the tag cloud we need to calculate the frequency in which they appear in Sourcerer's code base. To calculate such frequency, we use Sourcerer's search service, whose input is a query in the format contents:(term). The contents field limits the search to packages, classes, and method names of the entities in the database. Such a query returns a set of all occurrences of the given term, allowing for a trivial calculation of its frequency. However, this approach could bring performance problems, because each synonym term would have to be searched individually by the search service. Depending on the number of terms, the tag cloud generation would be inefficient because it would require too much communication time with the server. To cope with this problem, we used the OR operator supported by Lucene (http://lucene.apache.org/core/, accessed 06-27-2012), a high-performance, full-featured text search engine library written in Java, which is used in Sourcerer. This operator supports the inclusion of several terms in the query, resulting in a set containing all occurrences of all searched terms. The query format with the use of the OR operator is the following: contents:(term1 OR term2 OR term3).

Another characteristic that can be extracted from the results is the position of the term in the entity's fully qualified name. It is clear that the further to the right the term appears, the higher the probability that the entity is related to the searched functionality. For instance, suppose the synonym terms we are looking for are sum and calculation, and the returned entities containing such names are br.com.calculationapp.utils.QuickSort.sort and br.com.sumapp.utils.Math.sum. The first term (sum) is more likely to define the searched functionality, since it matches a method name in the repository. Therefore, such a term should be enhanced in the cloud.
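As a rough illustration of the frequency and position weighting just described (this is not CodeGenie's actual code; buildOrQuery, score and the weighting formula are assumptions made only for this sketch), the fragment below builds a contents:(term1 OR term2 OR ...) query string and scores each synonym against a list of fully qualified names returned by the search service, favoring matches that appear closer to the method name.

import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TagCloudWeights {

    // Builds a Sourcerer-style query such as "contents:(sum OR addition OR calculation)".
    static String buildOrQuery(List<String> synonyms) {
        return "contents:(" + String.join(" OR ", synonyms) + ")";
    }

    // Scores each synonym against the fully qualified names returned by the search
    // service: each occurrence adds a position factor that grows as the matching
    // segment gets closer to the method name (the rightmost segment of the FQN).
    static Map<String, Double> score(List<String> synonyms, List<String> matchedFqns) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String term : synonyms) {
            double w = 0.0;
            for (String fqn : matchedFqns) {
                String[] segments = fqn.split("\\.");
                for (int i = 0; i < segments.length; i++) {
                    if (segments[i].toLowerCase().contains(term.toLowerCase())) {
                        // position factor: 1.0 for the method-name segment,
                        // smaller for package-level matches further to the left
                        w += (i + 1) / (double) segments.length;
                    }
                }
            }
            weights.put(term, w);
        }
        return weights;
    }

    public static void main(String[] args) {
        List<String> synonyms = Arrays.asList("sum", "calculation");
        System.out.println(buildOrQuery(synonyms));
        List<String> matches = Arrays.asList(
                "br.com.calculationapp.utils.QuickSort.sort",
                "br.com.sumapp.utils.Math.sum");
        // "sum" gets the larger score: its match on the method name contributes a full 1.0.
        System.out.println(score(synonyms, matches));
    }
}

In CodeGenie itself the frequencies come from Sourcerer and the resulting weight is encoded visually, as explained next; the formula above is only a stand-in for that computation.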
To represent this characteristic in the cloud, we also used color intensity: the higher the weight of the term in the result set, the higher the intensity of green in its presentation in the cloud. The tag cloud interface was created using the Zest/Cloudio Java library (http://wiki.eclipse.org/Zest/Cloudio, accessed 06-27-2012), which supports the creation of clouds in different sizes and colors. Zest/Cloudio also supports dimensioning of the cloud and selection of terms. In the thesaurus tag cloud generator interface, the user can execute the following commands: (1) Replace a term: replaces the originally selected term with the new term and reruns the search on CodeGenie; (2) Create new tag cloud: based on the selected term, creates a new tag cloud, supporting browsing through terms until the most adequate one is found; and (3) Edit synonyms: lists the synonyms of the original term and makes it possible to add terms to or remove terms from the thesaurus.

V. EVALUATION

We have conducted a preliminary evaluation of the approach presented in this paper. In order to check whether using the thesaurus-based tag clouds would improve the result set, we developed test cases to search for six different functions, according to the TDCS approach. The selected functions are auxiliary features commonly used in software projects. Two recent studies have shown that auxiliary functions – also called programming tasks – are frequently targeted by queries issued to a popular and large code search engine [11], and that they are important for software projects in general [12]. A senior computer science undergraduate student with good Java knowledge (the third author of this paper) developed the sample test sets. Since he is a non-native English speaker, his choices of keywords might not be the most adequate.

Based on the test sets, we searched for the desired functions using CodeGenie, and the number of returned results was recorded. Then, to compare the initial outcome to the results using our approach, we generated tag clouds based on the keywords extracted from the same test sets, and replaced the method names with the most relevant term in the clouds. We then executed the search again in CodeGenie, and recorded the results.

The searches made using CodeGenie considered the class and method names of the desired functionality to be reused. The selected class and method names were the following: Calculation.total, a function to add two integers; Array.arrange, a function to sort elements in an array; Array.separate, a function to split integer elements of an array; File.weave, a function to merge the contents of two files; QueryUtil.same, a function to check query strings for equality; and Util.invert, a function to reverse strings. Figure 7 shows the generated tag clouds. Based on these clouds, the initially selected terms were replaced in the following manner: total was replaced by sum (see Figure 7(a)); weave was replaced by merge (see Figure 7(b)); arrange was replaced by sort (see Figure 7(c)); separate was replaced by split (see Figure 7(d)); and same was replaced by equal (see Figure 7(e)).

TABLE I. RESULTS OF THE EXPLORATORY EVALUATION (TC = Tag Cloud)
Function            without TC   with TC   Difference
Calculation.total   26           68        42
Array.arrange       1            180       179
Array.separate      6            24        18
File.weave          2            80        78
QueryUtils.same     2            6         4
Util.invert         0            2         2
Avg.                6.17         60        53.84
Note that in all six samples the use of tag clouds enlarged significantly the number of returned results (almost tenfold, on average). A paired Wilcoxon signed-rank test revealed that the means are significantly different at 95% confidence level (p-value=0.01563). We applied the Wilcoxon test because a Shapiro-Wilk normality test on our data showed evidence that the results with the use of tag clouds do not follow a normal distribution (p-value=0.002674). The usefulness of our approach is particularly evident in the sixth example, where no results were returned before using the tag cloud, and two adequate results were returned afterwards. By analyzing the results, we noted that our approach increases recall, but not necessarily precision. However, we believe the TDCS process itself supports high precision, when we consider the use of test cases and the interface of the desired function in the queries (a previous evaluation shows evidence of this [4]: all candidates that passed the tests in the study were relevant). These two mechanisms filter out unwanted results: only candidates that match the desired interface and are successful against tests can be considered to implement the intended function. For instance, in the second example where 180 results were returned after the use of tag clouds, by executing the test cases we can exclude spurious results, and a single working candidate is enough for the success of the search. This does not mean we need to test all candidates, because experimenting with some will probably be sufficient to reach a working result (remember that for the actual search, besides running the tests, we also consider parameter and return types in the query, so it is unlikely to have the majority of returned results as irrelevant). We noticed the thesaurus-based tag clouds can be specially useful for non-native speakers. For instance, consider the examples shown in Table I. As commented earlier, an undergraduate student with good Java knowledge chose the terms used in the searches. He is a Brazilian Portuguese native speaker. It makes sense for him to choose the word “separate” to define the splitting of contents of an array because the most adequate word in Portuguese for this function would be separar. Also, invert makes much more sense than revert, because the Portuguese word inverter is the one used with the intended meaning, while reverter is more frequently used to mean revert in the sense of going back to a previous state. As Portuguese is a Latin-based language, the closest words in English more natural to be chosen are the likewise Latin-based “separate” and “invert”. However, the tag cloud helps choosing a more common English word used to define such functions, i.e., “split” and “reverse”. In any case, we believe the thesaurus-based tag clouds can be useful not only for non-native speakers, but for anyone using CodeGenie. It is clear that developers are not always aware of the most common term used to define a given function, even when the repository is written in a native language. This is specially the case when developers are unfamiliar with the domain they are currently working on. A. Threats to Validity The validity of our initial study is threatened by several factors. In particular, the study had only a single subject. However, we are currently running a larger survey with professional developers and computer science undergraduate students to gather realistic keywords that would be chosen by the users. 
Initial analysis of the results shows that several subjects do choose inadequate keywords similar to the ones mentioned before, corroborating our initial results. For instance, of the 27 subjects that have already responded to our survey – among them 13 professional developers – almost 50% (16) chose the term 'invert' instead of 'revert' for the string reversion function.

Figure 7. Tag clouds for the sample terms: (a) total, (b) weave, (c) arrange, (d) separate, (e) same, (f) invert.

VI. RELATED WORK

Software reuse has been widely explored in software engineering since the 1960s [3, 16–18]. Several aspects of reusability make it a hard problem, including creating reusable assets, finding these assets, and adapting them to a new application [3]. Several approaches to solve this problem have been proposed, but only recent work explores code available in open source repositories. Next, we sample some software reuse and query expansion proposals, comparing them to TDCS and the approach presented in this paper.

Code reuse: An approach similar to TDCS was proposed by Podgurski & Pierce [19]. Behavior Sampling (BS) is a retrieval technique that executes code candidates on a sample of inputs and compares the outputs to an oracle provided by the searcher. However, differently from TDCS, inputs for the desired functions are randomly generated and expected outputs have to be supplied by users. TDCS implements the ability to retrieve code based on arbitrary tests, an extension to BS considered by Podgurski & Pierce. PARSEWeb [20] is a tool that combines static analysis, text-based searching, and input-output type checking for a more effective search.

Reiss [3] argues that most early reuse techniques did not succeed because they required either too little or too much specification of the desired feature. Signature or type matching by itself does not seem to be effective, although PARSEWeb shows that it can provide more interesting results in combination with textual search. Full semantic matching requires the specification of too much information, and is thus difficult to accomplish. TDCS uses test cases, which are generally easy to check and provide. In recent work, Reiss [3] also incorporates the use of test cases and other types of low-level semantic specifications into source code search. He also implements various transformations to make available code work in the current user's context. However, in his approach there is no slicing facility, and therefore only a single class can be retrieved and reused at a time. Moreover, the presented implementation is a web application, which also requires code results to be copied and pasted into the workspace in an ad hoc way. CodeGenie has the advantage of being tightly integrated with the IDE: code candidates can be seamlessly woven to and unwoven from the workspace.

Hummel et al. [2] also developed a tool similar to CodeGenie, named Code Conjurer. The tool is more proactive than CodeGenie, in the sense that code candidates are recommended to the user without the need for interaction. However, the slicing facility presented by CodeGenie is more fine-grained – at method level – and Code Conjurer does not deal with the vocabulary mismatch problem targeted by the tag cloud-based approach presented in this paper. In fact, the tool implements an automated adaptation engine, but it deals only with the interface specification (i.e., not with keywords).
Modern CASE tools are bringing sophisticated search capabilities into the IDE, extending traditionally limited browsing and searching capabilities [21–26]. These tools vary in terms of the provided features, but some common ideas prevalent among them are the use of the developer's current context to generate queries and the integration of ranking techniques for the search results. CodeGenie also shares such features, adding the use of test cases to provide a more sophisticated solution.

Tag clouds are also used in the Sourcerer API Search (SAS) [27]. The difference between SAS and CodeGenie is that the latter is tightly integrated with the IDE, and thus the user can access tag clouds directly from the development environment. Moreover, CodeGenie makes use of test cases to form queries, while SAS is mostly based on keywords. Another difference is that the tag clouds generated by SAS are not based on thesauri, but on an API similarity algorithm.

Query expansion: The application of Natural-Language Processing (NLP) to code and concept search has been proposed by earlier approaches. For instance, Shepherd et al. [28] combine NLP and structural program analysis to locate and understand concerns in a software system. The basic difference from our approach is that the concern location is targeted towards code in a local project, not in a code repository. Moreover, Shepherd et al.'s approach [28] requires more interaction, since similar terms have to be chosen by the user iteratively to construct the expanded queries. CodeGenie supported by tag clouds only requires the user to choose a more adequate term once to reissue the search. Gay et al. [29] have proposed an IR-based concept location approach that incorporates relevance feedback from the user to reformulate queries. The proposed approach, however, is not directed at code reuse, but at concept location within a given project. Moreover, like Shepherd et al.'s approach, it requires more user interaction.

More recently, Yang and Tan [30] proposed an approach that identifies word relations in code by leveraging the context of words in comments and code. The idea is to find words that are related in code, but not in English. For such pairs, lexicons like WordNet cannot help. Their approach could also be applied to CodeGenie, to improve the dictionary used in the formation of tag clouds with a secondary code-related synonyms database. In fact, we are currently working on an extension to our approach that incorporates this idea.

Most of the query reformulation approaches proposed in the past focus on concept location within a project, mainly to identify code that must be dealt with in a particular software engineering task. The difference between such approaches and ours is that our goal is to help find code – more specifically, methods – inside large open source repositories, with the intent of reusing it. However, other sophisticated NLP techniques incorporated by them could also be explored in the generation of our tag clouds. For instance, morphology changes could be applied to search for other morphological forms of a given term. This could improve the effectiveness of our thesaurus-based tag clouds.

VII. CONCLUSION AND FUTURE WORK

In this paper we have presented an approach that applies thesaurus-based tag clouds to improve TDCS. The tag clouds are formed by using an initial term extracted from the input test set and generating synonyms from this term. The synonyms are then looked up in a code base and presented in the tag cloud according to their frequency and weight in the repository. Initial evaluation has shown that the tag clouds can enlarge the set of returned candidates of a given search whose initial term would not be the most adequate.

Future work includes generating tag clouds based on API similarity instead of synonyms. More support for non-native speakers to better explore TDCS is also an interesting line of investigation. Dictionaries could be used to translate the initial term to the equivalent word in the intended language. This type of support would allow users to improve their chances of finding relevant code.

Another line of future work we are currently investigating is automatically expanding the code search queries. The idea shares the basic principle of the thesaurus-based tag clouds, but instead of generating the cloud, the query itself is automatically expanded with similar terms. Such an approach would skip the tag cloud examination and the changing of the initially developed test cases. However, the tag cloud approach would still be relevant in cases where the developer wants to check for terms in the repository before creating the test cases for the search.

As commented in Section V-A, we are also running a large survey to measure the impact of the vocabulary mismatch problem in the context of code search, particularly with non-native English speakers. Such a study will provide a basis to better understand this problem and consequently improve our tag cloud and query expansion approaches.

ACKNOWLEDGEMENTS

The authors would like to thank FAPESP for financial support (Otavio Lemos, grant 2010/15540-2).

REFERENCES

[1] S. K. Bajracharya, J. Ossher, and C. V. Lopes, "Leveraging usage similarity for effective retrieval of examples in code repositories," in Proc. of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering, ser. FSE '10, 2010, pp. 157–166.
[2] O. Hummel, W. Janjic, and C. Atkinson, "Code Conjurer: Pulling reusable software out of thin air," IEEE Softw., vol. 25, no. 5, pp. 45–52, Sep. 2008.
[3] S. P. Reiss, "Semantics-based code search," in ICSE '09: Proc. of the 2009 IEEE 31st International Conference on Software Engineering. Washington, DC, USA: IEEE Computer Society, 2009, pp. 243–253.
[4] O. A. L. Lemos, S. Bajracharya, J. Ossher, P. C. Masiero, and C. Lopes, "A test-driven approach to code search and its application to the reuse of auxiliary functionality," Inf. Softw. Technol., vol. 53, pp. 294–306, April 2011.
[5] G. W. Furnas, T. K. Landauer, L. M. Gomez, and S. T. Dumais, "The vocabulary problem in human-system communication," Commun. ACM, vol. 30, pp. 964–971, Nov. 1987.
[6] E. Linstead, S. Bajracharya, T. Ngo, P. Rigor, C. Lopes, and P. Baldi, "Sourcerer: mining and searching internet-scale software repositories," Data Mining and Knowl. Discovery, vol. 18, pp. 300–336, 2009.
[7] K. Beck, Test Driven Development: By Example. Addison-Wesley Professional, 2002.
[8] A. W. Rivadeneira, D. M. Gruen, M. J. Muller, and D. R. Millen, "Getting our head in the clouds: toward evaluation studies of tagclouds," in Proc. of the SIGCHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM, 2007, pp. 995–998.
[9] G. Fischer, S. Henninger, and D. Redmiles, "Cognitive tools for locating and comprehending software objects for reuse," in Proc. of the 13th ICSE, ser. ICSE '91. Los Alamitos, CA, USA: IEEE Computer Society Press, 1991, pp. 318–328.
[10] G. A.
Miller, “Wordnet: a lexical database for english,” Commun. ACM, vol. 38, no. 11, pp. 39–41, Nov. 1995. [11] S. K. Bajracharya and C. V. Lopes, “Analyzing and mining a code search engine usage log,” Empirical Softw. Engg., vol. 17, no. 4-5, pp. 424–466, Aug. 2012. [12] O. A. L. Lemos, F. C. Ferrari, F. F. Silveira, and A. Garcia, “Development of auxiliary functions: should you be agile? an empirical assessment of pair programming and test-first programming,” in Proceedings of the 2012 International Conference on Software Engineering, ser. ICSE 2012. Piscataway, NJ, USA: IEEE Press, 2012, pp. 529–539. [13] L. Briand, P. Devanbu, and W. L. Melo, “An investigation into coupling measures for C++,” in ICSE’97. ACM Press, 1997, pp. 412–421. [14] P. Mohagheghi, R. Conradi, O. M. Killi, and H. Schwarz, “An empirical study of software reuse vs. defect-density and stability,” in ICSE’04. IEEE, 2004, pp. 282–292. [15] R. Burrows, F. C. Ferrari, O. A. L. Lemos, A. Garcia, and F. Taiani, “The impact of coupling on the fault-proneness of aspect-oriented programs: An empirical study,” in Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering, ser. ISSRE ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 329–338. [16] M. D. McIlroy, “Mass produced software components,” in Proc. of NATO Softw. Eng. Conference, P. Naur and B. Randell, Eds., Garmisch, Germany, 1969, pp. 138–150. [17] C. W. Krueger, “Software reuse,” ACM Comput. Surv., vol. 24, no. 2, pp. 131–183, 1992. [18] H. Mili, F. Mili, and A. Mili, “Reusing software: Issues and research directions,” IEEE Trans. Softw. Eng., vol. 21, no. 6, pp. 528–562, 1995. [19] A. Podgurski and L. Pierce, “Retrieving reusable software by sampling behavior,” ACM Trans. Softw. Eng. Methodol., vol. 2, no. 3, pp. 286–303, 1993. [20] S. Thummalapenta and T. Xie, “Parseweb: a programmer assistant for reusing open source code on the web,” in ASE ’07: Proc. of the twenty-second IEEE/ACM international conference on Automated software engineering. New York, NY, USA: ACM, 2007, pp. 204–213. [21] R. Holmes and G. C. Murphy, “Using structural context to recommend source code examples,” in ICSE ’05: Proc. of the 27th international conference on Software engineering. New York, NY, USA: ACM Press, 2005, pp. 117–125. [22] D. Mandelin, L. Xu, R. Bodı́k, and D. Kimelman, “Jungloid mining: helping to navigate the api jungle,” in PLDI ’05: Proc. of the 2005 ACM SIGPLAN conference on Programming language design and implementation. New York, NY, USA: ACM Press, 2005, pp. 48–61. [23] R. Sindhgatta, “Using an information retrieval system to retrieve source code samples,” in ICSE, L. J. Osterweil, H. D. Rombach, and M. L. Soffa, Eds. ACM, 2006, pp. 905–908. 123 [24] D. Poshyvanyk, A. Marcus, and Y. Dong, “JIRiSS - an eclipse plug-in for source code exploration,” in ICPC ’06: Proc. of the 14th IEEE International Conference on Program Comprehension. Washington, DC, USA: IEEE Computer Society, 2006, pp. 252–255. [25] N. Sahavechaphan and K. Claypool, “Xsnippet: mining for sample code,” in OOPSLA ’06: Proc. of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications. New York, NY, USA: ACM Press, 2006, pp. 413–430. [26] L. L. N. da Silva Jr., T. N. de Oliveira Alexandre Plastino, and L. G. P. Murta, “Vertical code completion: Going beyond the current ctrl+space,” in Proc. of the 6th Brazilian Symposium on Software Components, Architectures and Reuse (SBCARS). IEEE Computer Society, 2012, pp. 81–90. 
[27] S. Bajracharya, J. Ossher, and C. Lopes, “Searching api usage examples in code repositories with sourcerer api search,” in Proc. of 2010 ICSE Workshop on Search-driven Development: Users, Infrastructure, Tools and Evaluation, ser. SUITE ’10. New York, NY, USA: ACM, 2010, pp. 5–8. [28] D. Shepherd, Z. P. Fry, E. Hill, L. Pollock, and K. Vijay-Shanker, “Using natural language program analysis to locate and understand action-oriented concerns,” in Proc. of the 6th international conference on Aspect-oriented software development, ser. AOSD ’07. New York, NY, USA: ACM, 2007, pp. 212–224. [29] G. Gay, S. Haiduc, A. Marcus, and T. Menzies, “On the use of relevance feedback in ir-based concept location,” in Proc. of the IEEE International Conference on Software Maintenance. IEEE, pp. 351–360. [30] J. Yang and L. Tan, “Inferring semantically related words from software context,” in Proc. of the 9th IEEE Working Conference on Mining Software Repositories. IEEE, 2012, pp. 161–170.

MTP: Model Transformation Profile
Magalhães, A. P.; Andrade, A.; Maciel, R. P.
Computer Science Department, Federal University of Bahia, Salvador, Brazil
{anapfmm, aline, ritasuzana}@dcc.ufba.br

Abstract—Model Driven Development (MDD) is a software development approach that makes intensive use of models, allowing the reuse not only of code but also of the models produced along the entire software development life cycle. At the core of MDD is a transformation chain that transforms input models into output models until code is generated, enabling software development automation. However, despite the importance of the development automation provided by MDD, there is a lack of modeling strategies to support transformation specification: transformations are usually specified in natural language and implemented directly in code. As a result, it is difficult to adopt best practices of software development and to enable reuse. This paper proposes a model transformation profile (MTP) that supports transformation specification from requirements elicitation onward. MTP provides platform-independent specifications that can be reused to generate code in different transformation languages. The profile is a specialization of UML for the transformation domain, taking advantage of the large number of tools that support this language, which is widely used in industry and academia. Keywords—model transformation specification; model driven development; model transformation development

I. INTRODUCTION

Model Driven Development (MDD) is a software development approach that makes intensive use of models to represent systems at different levels of abstraction; these models can be reused to generate new applications or to generate applications for many different platforms. Model Driven Architecture (MDA) [9] is an MDD framework proposed by the OMG [14] which has three levels of models corresponding to different abstraction levels of the development: the first level corresponds to the requirements phase, where the models are independent of computation; in the second level the models incorporate computation engineering solutions in the design and are independent of the platform; and in the third level the models incorporate platform features to generate code. The life cycle of MDD/MDA is based on a transformation chain, which is responsible for the conversion of models in order to preserve consistency among them. The models at high abstraction levels are transformed into models at lower abstraction levels until code is reached.
Transformations play an important role in MDD/MDA, enabling process automation and the reuse of models in different applications. The development of transformations requires good software engineering practices such as modeling languages, development processes, tools and so on. In [2], the authors propose treating transformations as models. In this context, works have been proposed [6][3][5] covering some aspects of transformation development, such as specific modeling languages. Usually these works use notations that depend on the transformation implementation language and lack support for the entire transformation development life cycle. The scenario in actual practice is that transformations are usually specified in an ad hoc way using natural language and are implemented directly in code [4], [12]. This is considered a bottleneck for the widespread adoption of MDD by industry [19].

Aiming to contribute to changing this scenario, we propose an MDA conceptual framework to model transformations at different abstraction levels, enabling their implementation on different platforms and their reuse. The framework comprises: (i) a UML transformation profile, named Model Transformation Profile (MTP), to be used as a transformation modeling language; and (ii) an MDA development process, named Transformation Development Process, which covers the entire transformation development life cycle from requirements to code. The MTP profile defines the concepts required to specify model transformations. MTP is also platform free, which means that the models can be transformed into code in many different transformation languages (e.g. QVT [11], ATL [1]). The generation of code for many platforms increases productivity and portability. MTP uses UML diagrams and OCL expressions in its specification. As a UML profile, MTP can make the development of transformations accessible to non-experts in the transformation domain, because UML is widely known and used in industry and academia, with many modeling tools available. The Transformation Development Process is specified using an instance of SPEM [10], the OMG Process Modeling Language (PML). It aggregates the advantages of MDA through the specification of transformation models at different abstraction levels, in a semi-automatic way, allowing the reuse of model transformations in other transformation developments.

This paper details the MTP profile, showing its syntactic and semantic elements. This profile is the core of the framework because it is essential for the definition of the MDA Transformation Process, i.e. the tasks of the process are defined according to the MTP profile. This paper is organized as follows: Section 2 presents some concepts about model to model transformation; Section 3 presents an overview of the conceptual framework; Section 4 details the MTP profile; Section 5 shows the evaluation of the approach in an exploratory study; Section 6 presents the related works; and finally Section 7 concludes this work and presents the next steps to be taken in this research.

II. MODEL TO MODEL TRANSFORMATIONS

Transformations are classified in two types: model to model (M2M) transformations, or model to text (M2T) transformations when they generate code as output from input models. This work covers M2M transformations.

A model transformation can be defined as a triple T = <MMs, MMt, R>, where MMs is a set of source metamodels, MMt is a set of target metamodels and R is a set of relationships between MMs and MMt. These metamodels comprise a set of elements, and R relates elements from the source metamodels (MMs) to elements of the target metamodels (MMt). For example, in Fig. 1 the model transformation (Mttrans) comprises a set R that relates the elements of the source metamodels (MMs) to the elements of the target metamodels (MMt). A metamodel describes an application domain and is specified according to a metametamodel, which is a metamodel description language. In Fig. 1, the source metamodel (MMs) and the target metamodel (MMt) are described using the same metametamodel (MMM).

Fig. 1. Example of model transformation structure (adapted from [13]).

A model M conforms to a metamodel MM if M is syntactically correct with respect to MM and satisfies MM's constraints. A model M is syntactically correct with respect to MM if M is a proper instance of MM, that is, essentially the objects in M are instances of metaclasses in MM and the links in M are instances of associations in MM [3]. For example, in Fig. 1 the model transformation Mttrans receives the input model Ms, which conforms to MMs, and produces the output model Mt, which conforms to MMt.

The conformance relationship is necessary but not sufficient to guarantee the correctness of a model transformation: it is also necessary that the target models preserve the semantics of the source models. Therefore, a model transformation metamodel must ensure that its instantiated model transformation models satisfy some properties: syntactic correctness, to guarantee that the generated models conform to the target metamodels; semantic correctness, to guarantee that the generated models preserve the semantics of the source models; and completeness, to guarantee that the transformation is able to process any input model which conforms to the source metamodel.
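As an aside, the conformance definition above can be illustrated with a small, self-contained sketch. The data structures below are hypothetical simplifications (they are not MTP, MOF or any modeling-tool API); they only mirror the two checks of the definition: objects must instantiate metaclasses of MM, and links must instantiate associations of MM.

```java
import java.util.*;

// Illustrative check of syntactic conformance of a model M to a metamodel MM:
// every object must instantiate a metaclass of MM and every link must
// instantiate an association declared in MM. Hypothetical structures only.
public class ConformanceSketch {

    record Link(String source, String association, String target) {}

    public static void main(String[] args) {
        // Metamodel MM: metaclasses and associations between them.
        Set<String> metaclasses = Set.of("Class", "Attribute");
        Set<Link> associations = Set.of(new Link("Class", "ownedAttribute", "Attribute"));

        // Model M: objects typed by metaclasses, and links between objects.
        Map<String, String> objects = Map.of("Person", "Class", "name", "Attribute");
        Set<Link> links = Set.of(new Link("Person", "ownedAttribute", "name"));

        boolean objectsOk = objects.values().stream().allMatch(metaclasses::contains);
        boolean linksOk = links.stream().allMatch(l ->
                associations.contains(new Link(objects.get(l.source()),
                        l.association(), objects.get(l.target()))));

        System.out.println("M conforms to MM: " + (objectsOk && linksOk));
    }
}
```

Semantic correctness and completeness, as stated above, are stronger properties and cannot be established by such a purely structural check.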
III. CONCEPTUAL FRAMEWORK

This section presents the proposed framework to develop model transformations, comprising an MDA Transformation Development Process and the MTP profile that supports the process modeling tasks. The framework enables the development of unidirectional model to model (M2M) transformations. It uses a standard modeling language that is suitable for the transformation domain and provides an MDA process that covers the entire software development life cycle for the generation of model transformations.

Fig. 2 shows an overview of the transformation framework, structured in packages. As can be seen, SPEM 2.0 and UML 2.2 are metamodels, instances of the MOF metametamodel. In the same way, the Transformation Process package is an instance of the SPEM 2.0 metamodel. It is defined using UML 2.2 diagrams (<<import>> stereotype) and comprises three subpackages: the Method Content and Process packages (according to SPEM) and the Transformation Chain package. The MTP package specializes UML (<<import>> stereotype) in a profile for the transformation domain. Additionally, the Transformation Process applies the MTP profile as a modeling language (<<apply>> stereotype). The next subsection gives a brief description of the Transformation Process package.

Fig. 2. Conceptual framework overview.

A. Transformation Process Package

The Transformation Process package contains the definition of an MDA process to guide the development of transformations. MDA comprises four model levels: the computation independent model (CIM); the platform independent model (PIM); the platform specific model (PSM); and code. In our framework, the CIM and PIM models are specified in UML 2.2 diagrams stereotyped by the MTP profile, while the PSM model is specified using the metamodel of the implementation language chosen by the developer to generate the transformation code.

According to SPEM, a process specification is divided into two dimensions: static concepts and dynamic concepts. The static dimension of the Transformation Process is defined in the Method Content package and in the Transformation Chain package, and the dynamic dimension of the Transformation Process is defined in the Process package.

The Method Content specification contains reusable concepts that will be used to define processes. In our framework it comprises five disciplines: (i) Specification and Analysis, (ii) Design, (iii) Implementation, (iv) Verification and Validation, and (v) Automation. Each of these disciplines has many tasks. For example, the Specification and Analysis discipline has the following tasks: functional requirement specification; non-functional requirement specification; requirement detail; define relations; and detail relation behavior. Each of these tasks is performed by roles; tasks comprise steps and consume or produce artifacts. For example, the transformation specifier role is responsible for the specification of functional requirements, which produces two artifacts: a use case diagram and a class diagram.

According to the MDA approach, the process is organized (dynamic dimension) in four phases: the Transformation Conceptual View phase, corresponding to the CIM model; the Transformation Computational View phase, corresponding to the PIM model; the Transformation Platform View phase, for the PSM model; and the Code phase, for the code model. Each phase may have some iterations, which contain task uses (selected tasks from the method content). For example, the main goal of the Transformation Conceptual View (CIM) phase is to define the transformation requirements that will be transformed into relations in the PIM phase.

The Transformation Chain package automates the proposed process to generate the transformation code at the end of the chain. Four transformations were specified: (i) CIM2CIM, to refine the CIM model; (ii) CIM2PIM, to transform models from the conceptual view into the computational view; (iii) PIM2PSM, to transform models from the computational view into a specific transformation language metamodel (e.g. QVT and ATL); and (iv) PSM2Code, which transforms the model instance of a transformation implementation language into code.

IV. MTP PROFILE

A UML profile is an extension mechanism to define a domain-specific language using UML concepts, and the proposed MTP profile is a UML extension for the model transformation domain. Its main goal is to provide a platform-free visual language, suitable for the model transformation domain, that can be used to develop model transformations at a high abstraction level (through CIM and PIM models). The MTP profile is at the M2 level of the OMG model layers, enabling model transformations to be designed as models (instances of MTP). Model transformation models are used as input to a transformation process (e.g. the transformation process introduced in Section III) to enable the generation of the model transformation code.

The MTP profile comprises two packages: MTPspec, used to model the artifacts produced in the Transformation Conceptual View phase (CIM); and MTPdesign, used to model the artifacts produced in the Transformation Computational View phase. The metamodels of MTPspec and MTPdesign are illustrated in Figs. 3 and 8 respectively, and the defined stereotypes that instantiate each UML metaclass are partially shown in Table I. A set of OCL constraints was used to determine well-formedness criteria for MTP instantiated models. The following subsections detail MTPspec and MTPdesign.

Fig. 3. MTPspec metamodel.

A. MTPspec

According to MTPspec (Fig. 3), a Model Transformation comprises Model and Requirement, which are responsible for defining the transformation requirements. Each Model Transformation has a name and an objective description, specified as attributes. The specification of a model transformation model requires the definition of the source and target metamodels (sourceMM and targetMM associations). Considering these metamodels, the model transformation model receives input models (inModel association) that conform to the source metamodel and produces output models (outModel association) that conform to the target metamodel. A Requirement has a description and a type that indicates whether it is functional or non-functional (specified as attributes). A functional Requirement represents, at a high abstraction level, the mapping of elements of the source models to elements of the target models of a transformation. Requirements may be refined into other Requirements (refineReq association) and may also be composed of other Requirements (compriseBy association). Requirements may have Constraints defined in natural language. The Model concept represents models, metamodels and metametamodels according to their Level (specified as an attribute), which indicates the OMG model layer in which they are defined (e.g. M3, M2, M1, M0). To guarantee the conformance among models of different levels (e.g. models at the M1 level must conform to models at the M2 level), some OCL constraints were specified (Fig. 4).

MTPspec is used in the first phase of the MDA Transformation Process, for requirements elicitation, where three types of UML diagrams can be used: the use case diagram (Fig. 6) and the class diagram associated with the object diagram (Fig. 7). Adopting the use case diagram we can represent the transformation requirements and their composition and refinement. The class diagram is more expressive and also allows the specification of extra information in its attributes, such as the requirement type (functional or non-functional). In our Transformation Process it is necessary to model the class diagram, because this is the input of the transformation that generates the PIM model. However, we also provide a transformation (which refines the use case model into a class model) for those who prefer to perform requirements elicitation with use case diagrams.

To illustrate the applicability of the MTP profile, let us consider the classical example of transforming a class model into a database schema (which we call OO2RDBMS). Fig. 6 shows the requirements identified for the OO2RDBMS model transformation, expressed in a use case diagram. The transformation OO2RDBMS (represented by the actor) performs four requirements (represented by the use cases): Map class model, to generate a schema from the class model; Map class, to indicate that classes will be mapped to tables; Map attribute, to create columns in the tables according to the class attributes; and Map associations, to create the table foreign keys. An OCL constraint (represented by a note) is also required to indicate that it is not possible to have two attributes with the same name in the same hierarchical class structure.
It is specified in a note linked to the Map attribute requirement. The requirements elicitation using a class diagram is shown in Fig. 7A.

Fig. 4. OCL constraint to guarantee the conformance among models of different levels.

The class diagram (Fig. 7A) can be complemented with extra information using an object diagram (Fig. 7B). For example, the source and target metamodels and the input and output models are specified as objects linked to the Model Transformation. Additionally, to guarantee the conformance of input models to the source metamodel and the conformance of output models to the target metamodel, an OCL constraint (named M1M2ConformanceElement, shown in Fig. 5) was also specified. It assures syntactic correctness between models and metamodels: all elements of an M1 model (e.g. the elements of the input and output models) must be instances of elements of an M2 model (e.g. elements of the source and target metamodels).

Fig. 5. OCL constraint to guarantee model instantiations.

Fig. 6. Requirement specification of OO2RDBMS transformation in a use case diagram.

In our OO2RDBMS transformation, the sourceMM and targetMM are the UML and RDBMS metamodels respectively, and the transformation receives a class model as input (defined as the object model1, linked to the model transformation by inModel) and generates a DBSchema as output (defined as the object model2, linked to the model transformation by outModel). In the class diagram we can also define whether a Requirement is functional or non-functional by specifying the type attribute. In the class diagram shown in Fig. 7A, the Map class requirement comprises (<<include>> association) the generation of a primary key for the table (Generate Identifier requirement). Similarly, the Map attribute requirement is refined into Map simple attribute and Map multivalued attribute (represented by the UML specialization association) according to each kind of attribute found in the class. As said before, through our transformation process (Section III-A) it is possible to automatically transform the requirements mapped in a use case diagram into a class diagram. It is a one-to-one mapping that generates, for each requirement represented by a use case, a requirement represented by a class.

Fig. 7. Requirements specification of OO2RDBMS transformation in a class diagram (A) complemented by an object diagram (B).

B. MTPdesign

The MTPdesign metamodel is shown in Fig. 8. It contains the elements required for the design of the transformations: Model Transformation, which has a set of source and target metamodels and input and output models instantiated from Model; Relations; and Constraints, now associated with Relations and with Models. Each Model Transformation has a name and an objective. The fact that Model Transformations can be composed of (composedBy association) other transformations enables their reuse, i.e. we can reuse existing transformations to compose new ones. Models are composed of Elements and each Element has a set of Attributes (e.g. the Attribute name of the Element Class in the UML metamodel).

Fig. 8. MTPdesign metamodel.

A Relation has a description, with a brief documentation of its purpose, and an obligation attribute that indicates whether the relation is automatically processed or not. If the Relation obligation is set to true, it will always be executed when the input model has an element that matches the input element of the Relation (e.g. matched rules in ATL).
If the Relation obligation is set to false, it will be executed only when invoked by another Relation. Relations may accept Parameters and can be defined by a Pattern or detailed in a set of RelationDetails. Our profile implements the patterns defined in [7], such as the mapping pattern for one-to-one Relations, where attributes of the source element are copied to attributes of the target element. Using a RelationDetail, the user can specify how each attribute of the target elements will be filled (e.g. the attribute name of a target element can be set to a specific string).

As mentioned before, the defined MTP stereotypes were mapped to the UML metaclasses they instantiate and are partially shown in Table I.

TABLE I. STEREOTYPES AND THEIR INSTANTIATED METACLASSES
Stereotype | Metaclass
<<Model Transformation>> | Actor, Class, Package
<<Requirement>> | Use Case, Class
<<Relation>> | Class, Component, Activity
<<Model>> | Class, Attribute, Package

A <<Model Transformation>> stereotype, for example, can be used as an actor in a use case diagram, as a class in a class diagram, and as a package grouping Relations. MTPdesign can be used with the following UML diagrams: class diagrams, to map Relations between source and target metamodel elements; object diagrams, to detail each rule definition; component diagrams, to define transformation composition and enable the creation of new transformations by reusing existing ones; and activity diagrams, when it is necessary to define the transformation control flow.

Fig. 9 shows a class diagram used to map the Model Transformation Relations in our example (the OO2RDBMS model transformation). Classes are structured in three packages. Two of them, UML and RDBMS, are stereotyped by <<Model>> and represent the source and target metamodels and their Elements (e.g. the UML metamodel contains the Element Class and the Element Attribute). The third package (OO2RDBMS, in the middle) is stereotyped by <<Model Transformation>> and has the set of Relations that comprise the model transformation. Each Relation receives an Element from the source metamodel and produces one or more Elements of the target metamodel (indicated by the arrows). The adopted pattern can be specified in the Relation attribute appliedPattern (e.g. the Model2Schema Relation uses the mapping pattern). The attribute obligation can also be specified to indicate whether the Relation will always be executed (e.g. a top relation in QVT).

Fig. 9. Relation definitions in a class diagram.

Instead of adopting a pattern, users may define their own Relation behavior using the object diagram. Fig. 10 shows the details of the definition of the Model2Schema Relation using an object diagram. The execution of this Relation will set the attribute called name of the target element (targetAttrib) to the string “DataBaseExe”. As a result, a Schema called “DataBaseExe” will be created from a class Model.

Fig. 10. Object diagram detailing the Model2Schema Relation.

The control flow of the Relations might also be specified using an activity diagram. Fig. 11 shows the control flow of the OO2RDBMS model transformation. The transformation begins by executing the Model2Schema Relation (whose obligation attribute we have already set to true). After that, the Class2Table Relation is called (this can be compared with a post-condition specification in QVT). Finally, no execution sequence is imposed on the other Relations.

Fig. 11. OO2RDBMS Relations control flow.
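For illustration only, the sketch below writes out by hand the effect that the modeled Class2Table relation (together with the Generate Identifier and Map attribute requirements) prescribes for the OO2RDBMS example. It is not code generated by the framework nor QVT/ATL output, and the element types are simplified placeholders.

```java
import java.util.*;

// What the modeled Relations prescribe for the OO2RDBMS example, written out
// by hand for illustration: each Class becomes a Table with a primary key and
// each attribute becomes a column. Simplified placeholder types only.
public class Oo2RdbmsSketch {

    record Clazz(String name, List<String> attributes) {}
    record Table(String name, List<String> columns) {}

    // Class2Table: maps a source Class element to a target Table element.
    static Table class2Table(Clazz c) {
        List<String> columns = new ArrayList<>();
        columns.add(c.name().toLowerCase() + "_id");   // Generate Identifier requirement
        columns.addAll(c.attributes());                // Map attribute requirement
        return new Table(c.name(), columns);
    }

    public static void main(String[] args) {
        Clazz person = new Clazz("Person", List.of("name", "birthDate"));
        System.out.println(class2Table(person));       // Table[name=Person, columns=[person_id, name, birthDate]]
    }
}
```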
an MDA transformation chain may have a CIM2PIM transformation, a PIM2PSM transformation and a PSM2Code transformation). It is important to model the architecture of the transformation chain and to define what the input and output models and the source and target metamodels of each of these transformations are. MTP allows the use of a component diagram to specify the transformation chain structure, as shown in Fig. 12. Each transformation is represented by a component (e.g. OO2RDBMS) and has two required interfaces that represent its inputs, one for the source metamodel (e.g. UML) and the other for the input model (e.g. ClassModel), and provides one interface that represents the output of the transformation (e.g. RDBMS). A component diagram can also be used to define a transformation composition. This allows the reuse of existing transformations to develop new ones.

Fig. 12. Transformation chain architecture.

V. MTP PROFILE EVALUATION

In order to evaluate the MTP profile, we specified the transformation chain of a real MDA process which has been used in some projects in our laboratory [17][18][20]. Within this scenario, we wanted to evaluate the expressiveness of the profile in terms of structure and behavior specifications along the transformation development life cycle. The evaluation was guided by the following questions:

Q1: Does the profile allow the definition of the transformation requirements?
Q2: Can we specify the models and metamodels that are used as input and output of each transformation along the entire chain?
Q3: Does the profile allow the definition of the Relations and of their structure and behavior?
Q4: Can we model a transformation chain showing the involved transformations and their relationships?

Evaluating a profile such as MTP is not an easy task, due to the complexity of identifying an adequate transformation example that covers all the definitions. So we decided to divide the evaluation into phases, to better identify and deal with problems as they emerged. First of all we used a classical transformation, the transformation of a class model into an RDBMS schema, partially shown in this paper. This example served to illustrate the MTP metamodels, and it demonstrated the adherence of the defined concepts to the UML diagrams, as it was possible to represent every phase of the transformation development life cycle with them. After that, we used a real MDA process in a second evaluation, illustrated in the next subsection.

A. Evaluation Description

To carry out the evaluation of the MTP profile we used the MDA process for specific middleware services proposed in [8], which encompasses the specification and implementation of portable specific middleware services. In this process, system functional requirements are decomposed, through a transformation chain, down to middleware service code. We will refer to it in the rest of this paper as the middleware process. The validation consists of the specification of the transformation chain that automates the middleware process. This transformation chain had already been specified in an ad hoc way (using natural language) and the transformations were implemented directly in code in the ATL language. Consequently, there was no documentation except a brief description in natural language and the ATL code.

After analyzing the existing brief description of the transformations, we specified them using the proposed MTP profile. The middleware process adopts some profiles (RM-ODP, BP and CCA) and its transformation chain comprises four transformations: a CIM to CIM transformation (named RM-ODP to BP, which refines CIM models); a CIM to PIM transformation (named BP to CCA); a PIM to PSM transformation (named CCA to PSM); and a PSM to Code transformation, to generate code for the selected platform. Due to lack of space, this paper only shows the RM-ODP to BP transformation specification. The description of this specification is organized in phases according to the Transformation Process briefly described in Section III-A: the Transformation Conceptual View phase, the Transformation Computational View phase and the Transformation Platform View phase. The diagrams constructed during the development process form significant documentation of the transformation; moreover, as models, they can be used to generate transformations in different transformation languages. After the PIM specification using the MTPdesign profile, these diagrams can be automatically or semi-automatically transformed into a specific transformation language such as QVT or ATL (the PSM models), from which the transformation code is then generated.

CIM – Transformation Conceptual View Phase. The main purpose of this phase is to capture the transformation requirements. To help us in requirements elicitation we used an example of a system which was developed in our laboratory using the middleware process (Fig. 13). The objective of this software is to manage paper writing. In Fig. 13A a use case diagram models one of the user requirements, called Design a Project, which allows the creation of new projects (other requirements were modeled in other use case diagrams and are not shown here). Through it, the Primary Author performs the following actions: Form a group, Register the project, Define roles and Define Author. The RM-ODP2BP transformation was executed using this diagram as input, and it produced a class diagram. This class diagram contains a central class (representing the requirement) grouping other classes, one for each action (Fig. 13B). Additionally, other classes were created to represent the necessary input and output data.

Fig. 13. Example of software developed with the Middleware Process.

Based on this example we defined the RM-ODP to BP requirements, shown in a use case diagram (Fig. 14). The actor represents the transformation and executes two main requirements: Map user requirements and Map Actions, which include the creation of the data definition model. We also observed that it is necessary to indicate that each user requirement should be described in a single use case diagram. This was represented in a note as a constraint in natural language.

Fig. 14. RM-ODP to BP functional requirements in a use case diagram.

The use case diagram was transformed by our framework into a class diagram (Fig. 15A), where we add the requirement type. An object diagram was also constructed to add extra information such as the input and output models and the source and target metamodels. The class stereotyped by <<Model Transformation>> contains the description of the transformation. The object T1 comprises the identification of the selected metamodels and models. For example, the sourceMM was set to the RM-ODP metamodel and the targetMM was set to the BP metamodel.

PIM – Transformation Computational View Phase. The main goal of this phase is to define the transformation Relations and to model their structure and behavior. Based on the class diagram with the Requirements (modeled in the CIM phase in Fig.
15) we use a CIM2PIM transformation to generate the initial version of a map relations class diagram (Fig. 16) that represents the transformation Relations. The map relation diagram is composed of three packages: a package for the RM-ODP metamodel (stereotyped as <<Model>>) and their elements (represented by classes stereotyped as <<Element>>); a package for the BP metamodel (stereotyped as <<Model>>) and their elements (represented by classes stereotyped as <<Element>>); and a package for the model transformation (stereotyped as <<Model Transformation>>) with a set of relationships (classes stereotyped as <<Relation>>), one for each requirement identified in the previews phase. Fig. 15. RM-ODP to BP requirements modeled in a class diagram (A) complemented by an object diagram (B). 132 Fig. 16. Map relation class diagram. From these packages we can relate <<Element>> (from RM-ODP metamodel) to <<Element>> (from BP metamodel) defining the transformation Relations. The adopted pattern is also defined as an attribute in each Relation, if necessary. For example, the MapUserRequirements Relation relates the Process Element (from RM-ODP metamodel) to a BusinessProcess Element (from BP metamodel) and uses the Refinement pattern. The MapAction Relation relates the Action Element (from the RM-ODP metamodel) to four elements of the BP metamodel (Activity, InputGroup, OutputGroup and ProcessFlowPort) and does not use a pattern, its behavior will be later defined. It is important to notice that in this diagram we model what is transformed in what, not how an element is transformed in other (except in cases where a pattern is used). For the Relation that does not use patterns it is necessary to define its behavior. The object diagram was used to specify the MapActions behavior (Fig. 17). For each created element of the output model we defined how the attribute name must be assigned (assignExp attribute). We modeled the complete transformation chain architecture in a component diagram, as shown in Fig. 18. The first transformation RM-ODP to BP (Fig. 18) receives as input a DomainRequirement model which is a Use Case diagram conformed to RM-ODP metamodel and generates as output a class model (BusinessProcess) conformed to BP metamodel. Fig. 17. MapRelation behavior in an object diagram. Related to the first and second questions, the requirements elicitation was satisfactory represented by the use case and class diagrams. It was possible to represent requirements composition and constraints when necessary.Besides that, input and output models and metamodels were well defined using classes attributes. Fig. 18. Transformation chain architecture. These models are the input of the BPtoCCA model transformation that generates as output a class model conformed to a CCA metamodel. These models are the input of the CCAtoPSM model transformation that generates as output a component model conformed to EJB metamodel. The MTP profile covers the transformation specification until this phase. After this, it is necessary to choose a transformation implementation language and transform the PIM model into a PSM model. PSM - Platform View Phase. The main goal of this phase is to add platform detail to the model transformation. We decided to implement this transformation in QVT. Our transformation process transforms the previous model in a QVT class model as illustrated in Fig. 19. 
As can be seen, the RM-ODP to BP (<<RelationalTransformation>> stereotype) contains two input models mm1, instance of RMODP (<<relationDomain>>), and mm2, instance of BP (<<relationDomain>> ). The Relation MapUserRequirements is represented by the stereotype <<Relation>>. We now add some information specific to the QVT platform such as the attributes isCheckable and isEnforceable in the <<RelationDomain>>. Once the specific model is complemented by the developer with the platform information it is ready to generate code. Some considerations about the validation. We had observed that MTP could be applied to many UML diagrams allowing different constructions (structural and behavioral) for this transformations scenario. Fig. 19. Part of the RM-ODP transformation instance of QVT. Related to the third question, transformation structure was modeled through the mapping of Relation in a class diagram and by the component diagram. Relation behavior was modeled using patterns and an object diagram, but we observe the need to specify some other patterns due to the complexity of the behavior definition using object diagrams. For example, to use the object diagram the developer must know all the metamodel elements and their attributes to define how the relation should be processed. The last question (fourth question) was about transformation chain architecture which was well defined using the component diagram. Through the use of components to represent each transformation it was possible to model their relations and input/output artifacts. However, we noticed the need for a diagram to orchestrate the transformation. The profile specification should be also improved by adding more details to eliminate some ambiguity in using the concepts because some concepts can be used in different UML elements (e.g. <<Model Transformation>> can be a package, a class or a component) and the process left this decision open. VI. RELATED WORKS There are some approaches to developing model transformations that address specific languages and notations supported by proprietary tools. They usually focus on specific development aspects such as design and formal specifications usually not covering all the phases presented in a system development life cycle. The MOF Query/View/Transformation (QVT), the OMG proposed approach, is the most well known initiative to specify and develop model transformation. QVT Specifications use the Relation language which can be represented diagrammatically in two ways: using the UML class diagram or using a transformation diagram [11]. Although the class diagram is widely used in industry, the complexity of the QVT metamodels makes the diagram verbose and difficult to understand. On the other hand, the transformation diagram brings new notation with no portability for UML tools. Additionally, QVT does not provide a process to guide the developers in language usage. A relevant work which can be compared to our profile is the family of languages, named transML [6] to specify model transformations. This family of languages contains diagrams for the entire development life cycle providing support for specification, analysis, design and code. However, the proposed diagrams use a UML heavy extension with new notations that makes it difficult to learn and to integrate with the existing UML tools usually used. 
MTP also covers the entire transformation development life cycle since requirements to design and proposes a UML profile that has been able to represent model transformation without the necessity of use heavy extension. 133 In the direction of formal specification, [5] proposes a visual, formal, declarative specification language to be used at the initial stages of the development phases (analysis and design) focusing on transformation correctness. Its purpose is to specify what the transformation has to do and the properties that the transformed models should satisfy, instead of implementing the transformation. The specification is used to automatically derive assertions that will be used to test the implementation. REFERENCES [1] [2] [3] [15] proposes a notation for transformation design. This approach focuses on design and does not cover other stages of the development life cycle such as requirement and analyze. [4] In [12], Trans-DV, a systematic approach to the development of model transformation specifications is proposed. It comprises a model transformation requirements model, for the specification of transformation requirements and metamodel elements; a model transformation specification model, to specify structural and behavioral features; and a formal template catalogue to facilitate the use of formal verifications by non-experts in formal methods. Each one of these models is developed in steps described to guide the developer. However, the main goal of it is not transformation implementation but to give support to the specification of formal methods in the transformation development. [5] In summary, there are notations and languages for modeling model transformation but most of them propose a new notation or use heavy UML extensions and they do not cover the necessities of specification of the entire transformation development life cycle. [6] [7] [8] [9] [10] [11] [12] [13] VII. CONCLUSIONS AND FUTURE WORKS In this paper we presented a UML profile (MTP) to develop model transformations at a high abstraction level which is part of a transformation framework that comprises an MDA process. This framework proposes a model transformation model that enables the instantiation of transformations in different platforms saving effort on the part of developers in the entire transformation software development phases. The proposed profile specifies the knowledge of the transformation in an abstract way that otherwise would only be embedded in code or in natural language. We are currently investigating how to improve the profile to provide some required transformation properties [16], for example semantic correctness and completeness and other facilities such as bidirectional transformation. Additionally, in order to carry out a more accurate validation, we intend to use the profile in an industrial case study. Besides that we are adapting our current MDA environment named MoDerNe [21] to support transformation process specification and enactment. ACKNOWLEDGMENT We would like to thanks Jandson S. Ribeiro Santos (supported by PROPCI/UFBA 01-2012-PIBIC) for the specification of the OCL constraints presented in this work. [14] ATL Project - http://www.eclipse.org/m2m/atl/ Bézivin, J., Buttner, F., Gagolla, M., Jouault, F., Kurtev, I., Lindow, A.Model Transformations? Transformation Models? MoDELS 2006, LNCS 4199, pp. 
440–453, Springer-Verlag Berlin Heidelberg (2006) Braga, C., Menezes, R., Comicio, T., Santos, C., Landim, E.: On the Specification Verification and Implementation of Model Transformation with Transformation Contracts. SBMF – Simposio Brasileiro de Engenharia de Software (2011), pp. 108-123, ISBN: 978-3-642-25031-6, ACM, São Paulo, Brazil (2011). Guelfi, N. et al.: DRIP Catalyst: An MDE/MDA Method for Faulttolerant Distributed Software Families Development. In: OOPSLA Workshop on Best Practices for Model Driven Software Development. Canada. (2003) Guerra, E.; Lara, J.; Kolovos, D.; Paige, R. A Visual Specification Language for Model-to-Model Transformations. IEEE Symposium on Visual Languages and Human-Centric Computing, DOI 10.119/VLHCC.2010.25, 2010. (2010) Guerra, E.; Lara, J.; Kolovos, D.; Paige, R.; Santos O. TransML: A Family of Languages to Model Model Transformations. Models, 2010, DOI 10.1007/s10270-011-0211-2, Springer Verlag, . (2010) Iacob, M., Steen, M., Heerink, L.: Reusable Model Transformation Pattern. In 3M4EC´08, pages 1-10 (2008) Maciel, R.., Silva, B. C. Mascarenhas, L.A..:An Edoc-based Approach for Specific Middleware Services Development. In: 4th Workshop on MBD of Computer Based Systems, pp.135–143, Proc. IEEE Press, Postdam, Germany. (2006) OMG. MDA Guide. Version 1.0.1 (omg/2003-06-01) (2003) OMG. Software Process Engineering Metamodel Specification, Version 2.0, (formal/08-04-01).(2008) QVT specification - http://www.omg.org/spec/QVT/1.0/PDF/ Sani, A.; Polack, F.; Paige, R. Model Transformation Specification for Automated Formal Verification. 5th Malaysian Conference in Software Engineering, IEEE, 2011.(2011) Mellor, S.; Clark, A.; Futagami, T. Model Driven Development. IEEE Software, ISBN 0 7 4 0 - 7 4 5 9 / 0 3, 2003 OMG Object Management Group – www.ogm.org [15] Rahim, L.; Mansoor, S. Proposed Design Notation for Model Transformation. 19th Australian Conference on Software Engineering, 2008. [16] Lano, K.; Clark, D. Model Transformation Specification and Verification. The Eighth International Conference on Quality Software, IEEE, 2008. [17] Maciel, R. S. P. ; GOMES, R. A. ; MAGALHÃES, A. P. ; Silva, Bruno Carreiro da ; QUEIROZ, J. P. B. . Supporting model-driven development using a process-centered software engineering environment. Automated Software Engineering v. 1, p. 1, 2013. [18] Bispo, C. P. ; Maciel, R. S. P. ; DAVID, J. M. N. ; RIBEIRO, I. ; CONCEICAO, R. . Applying a Model-Driven Process for a Collaborative Service-Oriented Architecture. In: International Conference on Computer Supported Cooperative Work in Design (CSCWD), 2010. [19] Hutchinson, J.; Whittle, J.; Rouncefield, M. Kristoffersen, S. Empirical Assessmet of MDE in Industry. ICSE´11, Waikiki, Honolulu, HI, USA, 2011. [20] Maciel, R. S. P. ; FERRAZ, Carlos Guimaraes ; ROSA, N. S . An MDA Domain Specific Architecture to Provide Interoperability Among Collaborative Environments. In: 19 Simpósio Brasileiro de Engenharia de Software, 2005. [21] MACIEL, RITA SUZANA PITANGUEIRA ; GOMES, RAMON ARAÚJO ; MAGALHÃES, ANA PATRÍCIA ; SILVA, BRUNO C. ; QUEIROZ, JOÃO PEDRO B. . Supporting model-driven development using a process-centered software engineering environment. Automated Software Engineering v. 1, p. 1, 2013. 134 A Model-Driven Infrastructure for Developing Product Line Architectures Using CVL Amanda S. Nascimento Cecı́lia M.F. 
Rubira (both at the Institute of Computing, University of Campinas, Campinas, SP, Brazil), Rachel Burrows (Department of Computer Science, University of Bath, Bath, UK) and Fernando Castor (Informatics Center, Federal University of Pernambuco, Recife, PE, Brazil)
[email protected] [email protected] [email protected] [email protected]

Abstract—Over recent years, software engineers have been evaluating the benefits of combining Software Product Line and Model-Driven Engineering, which is referred to as Model-Driven Product Line Engineering (MD-PLE), to achieve software architecture evolvability. In general terms, existing solutions for MD-PLE support the refinement of high-level models into lower level abstractions in order to reach a code implementation of product line architectures. Nevertheless, the applicability of such approaches is limited either by unsystematic processes that do not clearly specify how to refine models to decrease the level of abstraction, or by the lack of automation support. In this paper, we propose an infrastructure for MD-PLE. The infrastructure encompasses a model-driven, systematic and semi-automated engineering method that combines a set of existing processes, tools, languages and models to implement product line architectures. The Common Variability Language (CVL) is adopted to specify and resolve architectural variability explicitly and systematically. We employ our solution to develop a family of software fault tolerance techniques for service-oriented architectures. The results obtained suggest the proposed solution is useful and efficient for implementing product line architectures. We report lessons learned from this case study and present directions for future work. Keywords—Software Product Lines; Model-driven Method; Common Variability Language;

I. INTRODUCTION

Currently, there is an increasing need to address software architecture evolvability in order to gradually cope with new stakeholders' needs. At the same time, the software system's desired time-to-market is ever decreasing. This reality demands software architectures capable of rapid modification and enhancement, to achieve cost-effective software evolution. To face these needs, advanced software paradigms have emerged. For instance, Model Driven Engineering (MDE) and Software Product Line Engineering (SPLE) are two promising cases in particular, as they increase software reuse at the architectural level of design [1], [2], [3]. MDE conceives software development as transformation chains where higher level models are transformed into lower level models [4]. In MDE, reuse is mainly achieved by means of model transformations, which are built once but can be enacted for a variety of input models that yield different results [4]. Models are considered first-class entities; they are used to capture every important aspect of a software system for a given purpose and are not just auxiliary documentation artefacts; rather, they are source artefacts and can be used for automated analysis and/or code generation [2], [5]. SPLE systematises the reuse of software artefacts through the exploration of commonalities and variabilities among similar products [6], [7]. Feature modelling is a technique for representing the commonalities and the variabilities in concise, taxonomic form [8], [2].
A key to the success of SPL implementation is structuring its commonalities and variabilities in Product Line Architectures (PLAs) in terms of variable architectural elements, and their respective interfaces, which are associated with variants [6]. Furthermore, SPLE encompasses two sub-processes [6]: (i) domain engineering aims to define and realize the commonality and the variability of the SPL; and (ii) application engineering aims to derive specific applications by exploiting the variability of the SPL. The combination of MDE and SPLE, referred to as ModelDriven Product Line Engineering (MD-PLE), integrates the advantages of both [1]. SPLE provides a well-defined application scope, which enables the development and selection of appropriate modelling languages [2]. On the other hand, MDE supports the different facets of a product line more abstractly. For instance, architectural variability can be specified systematically by appropriate models, thus enabling (i) an automated product derivation process in accordance with feature configurations, also called resolution models [9]; and (ii) an automated generation of source code to implement PLA models [10], [2]. Model-driven engineering method to develop SPLs should describe systematically how to refine models to decrease the level of abstraction to reach a code implementation. Nevertheless, existing solutions for MD-PLE either do not define clearly the sequence of models to be developed at each SPLE sub-process and how models transformations are performed between the different process phases [10], [1], or, if they do, they lack of automation support [11], [5]. In this sense, we propose a model-driven infrastructure to implement PLAs through a step-by-step refinement from highlevel models into lower level abstractions. The infrastructure employs a set of existing process, tools and models, such as model transformations are semi-automated. More specifically, to specify common and variable requirements of the product line we adopt use case models, as suggested by Go- 135 maa [12], and feature models [8]. Second, the FArM (FeatureArchitecture Mapping [8]) method is adopted to perform a sequence of transformations on the initial feature model to map features to a PLA. Third, we explicitly and systematically specify architectural variability such that executing the variability specification model with a resolution model will yield a product model. In particular, our solution is based on the Common Variability Language (CVL), a generic variability modelling language being considered for standardization at the Object Management Group (OMG) [9]. Finally, in accordance with the PLA model, the source code generation is guided by component-based development process and uses COSMOS*, a component implementation model that materialises the elements of a software architecture using the concepts available in object-oriented programming languages [13]. We have identified a set of supporting tools to specify and validate the target models and automate model-to-model transformations. We show the feasibility and effectiveness of the proposed solution by employing it to implement a PLA to support a family of software fault tolerance techniques applied to service-oriented architectures (SOAs). The resulting models are described in detail and lessons learned from executing this case study are also presented. 
Contributions of this work are twofold: (i) a model-driven, systematic and semi-automated engineering method to develop PLAs; and (ii) the incorporation of CVL and its supporting tools in a comprehensive engineering method for PLA implementation, ensuring that CVL models are effectively coordinated.

II. BACKGROUND

We present an overview of the main concepts, definitions, models, methods and languages encompassed by our solution.

A. Cosmos* Implementation Model

The COSMOS* model [13] is a generic and platform-independent implementation model, which uses object-oriented structures, such as interfaces, classes and packages, to implement component-based software architectures. The advantages of COSMOS*, when compared with other component models such as the Corba Component Model (CCM), Enterprise Java Beans, and .NET, are threefold. First, COSMOS* provides traceability between the software architecture and the respective source code by explicitly representing architectural units, such as components, connectors and configurations. Second, COSMOS* is based on a set of design patterns and is thus considered a platform-independent model. Third, the set of design patterns employed by COSMOS* can be automatically translated to source code. In particular, our solution employs Bellatrix [14], an Eclipse plug-in that translates graphically specified software architectures (using the Unified Modelling Language - UML) to Java source code in COSMOS*. Bellatrix allows the creation of a 'skeletal' system in which the communication paths are exercised but which at first has minimal functionality. This 'skeletal' system can then be used to implement the system incrementally, easing the integration and testing efforts [15].

To address different perspectives of component-based systems, COSMOS* defines five sub-models: (i) a specification model, which specifies the components; (ii) an implementation model, which explicitly separates the provided and required interfaces from the implementation; (iii) a connector model, which specifies the links between components using connectors, thus enabling two or more components to be connected in a configuration; (iv) a composite component model, which specifies high-granularity components composed of other COSMOS* components; and (v) a system model, which defines a software component that can be executed, thus encapsulating the necessary dependencies.

Figure 1 (a) shows an architectural view of a COSMOS* component called CompA and Figure 1 (b) shows the detailed design of the same COSMOS* component [13]. We have used UML notation in this paper to model architecture and detailed design and, in Figure 1 (a), we have omitted operations and attributes for the sake of clarity. The component is internally divided into specification (CompA.spec package) and implementation (CompA.impl package). The specification is the external view of the component, which is sub-divided into two parts: one that specifies the provided services (CompA.spec.prov package) and another that makes dependencies explicit (CompA.spec.req package). For instance, IManager and IA are provided interfaces and IB is a required interface. The COSMOS* implementation model also defines classes to support component instantiation and to implement the provided interfaces. The CompA.impl package has three mandatory classes: (i) a ComponentFactory class, responsible for instantiating the component; (ii) a Facade class that realizes the provided interfaces, following the Facade design pattern; and (iii) a Manager class that realizes the IManager interface and provides meta-information about the component. ClassA and ClassB are examples of auxiliary classes.

Fig. 1. (a) An architectural view of a COSMOS* component; (b) a detailed design of the same COSMOS* component.
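A rough Java sketch of the structure just described is shown below: provided and required interfaces, a Facade realizing the provided services, a ComponentFactory for instantiation and a Manager exposing meta-information. It follows the CompA example but is only an illustration of the pattern, not code produced by Bellatrix.

```java
// Rough sketch of the COSMOS* structure described above, following the CompA
// example: provided interfaces realized by a Facade, a ComponentFactory for
// instantiation and a Manager for meta-information. Illustration only, not
// code generated by Bellatrix.

// CompA.spec.prov: provided services.
interface IA { void doSomething(); }
interface IManager { String componentName(); }

// CompA.spec.req: explicit dependency on another component.
interface IB { int requiredService(); }

// CompA.impl: mandatory classes of the implementation package.
class Facade implements IA {                     // realizes the provided interfaces
    private final IB requiredB;                  // dependency supplied at creation time
    Facade(IB requiredB) { this.requiredB = requiredB; }
    public void doSomething() { System.out.println("using IB: " + requiredB.requiredService()); }
}

class Manager implements IManager {              // meta-information about the component
    public String componentName() { return "CompA"; }
}

class ComponentFactory {                         // instantiates the component
    static IA createCompA(IB requiredB) { return new Facade(requiredB); }
}

public class CosmosSketch {
    public static void main(String[] args) {
        IA compA = ComponentFactory.createCompA(() -> 42);  // IB given as a lambda stub
        compA.doSomething();
    }
}
```

The explicit separation of spec and impl is what gives the traceability between architecture and code mentioned above.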
The CompA.impl package has three mandatory classes: (i) a ComponentFactory class, responsible for instantiating the component; (ii) a Facade class that realizes the provided interfaces, following the Facade design pattern; and (iii) a Manager class that realizes the IManager interface and provides meta-information about the component. ClassA and ClassB are examples of auxiliary classes.
Fig. 1. (a) An architectural view of a COSMOS* component; (b) a detailed design of the same COSMOS* component

B. Software Product Lines (SPLs)
SPL engineering aims to improve development efficiency for a family of software systems in a given domain [6].
1) Use Case Modelling for Software Product Lines: To specify the functional requirements of an SPL, it is important to capture the requirements that are common to all members of the family as well as the variable ones. Gomaa [12] identifies different kinds of use cases: kernel use cases, which are needed by all members of the product line; optional use cases, which are needed by only some members of the product line; and alternative use cases, which are usually mutually exclusive and where different versions of the use case are required by different members of the product line. We refer to Gomaa [12] for further details on how to identify and specify use cases.
2) Feature Models: A feature is a system property that is relevant to some stakeholder [16]. A feature model represents the commonalities among all products of a product line as mandatory features, while variabilities among products are represented as variable features. Variable features generally fall into three categories: (i) optional features, which may or may not be present in a product; (ii) alternative features, which indicate a set of features from which only one must be present in a product; and (iii) multiple features, which represent a set of features from which at least one must be present in a product [17], [3]. The feature model can also represent additional constraints between features. Some examples of constraints are mutual dependency, when a feature requires another, and mutual exclusion, when a feature excludes another [17], [3].
3) Product Line Architectures (PLAs): A PLA is a key artefact to achieve a controlled evolution and it should (i) be consistent with the feature model; and (ii) represent the common and variable parts of a product line [6]. In PLAs, software variability can be achieved by delaying certain architectural design decisions, which are described through variation points. A variation point is the place in the software architecture where a design decision can be made. Variants are the different possibilities that exist to satisfy a variation point.
4) Feature-Architecture Mapping (FArM): By iteratively refining the initial feature model, the FArM method enables the construction of a transformed feature model containing exclusively functional features, whose business logic can be implemented in architectural components [8]. FArM is reported to improve the maintainability and flexibility of PLAs [8]. The FArM transformation flow is shown in Figure 2 and is briefly described in the following. Firstly, non-architecture-related features should be removed from the initial feature model. Quality features, in turn, can be resolved by integrating them into existing functional features (by enhancing their specification) or by adding new features that provide functional solutions fulfilling the quality features and their specifications.
Secondly, architectural requirements (AR) are identified (e.g. authentication and interoperability). ARs can be handled through direct resolution, integration into existing functional features, or the addition of new functional features. The third transformation is based on the interacts relation, which is used to model the communication between features. A valid FArM interacts relation connects two features where one feature uses the other feature's functionality, or where the correct operation of one feature either alters or contradicts the behaviour of the other feature. Once all interacts relations between features are identified, features are then transformed based on the type and the number of interacts relations. The last transformation is based on the analysis of hierarchy relations between super-features and their sub-features. A sub-feature can specialize, be part of, or present alternatives to its super-feature. In this transformation, invalid hierarchy relations are removed and the affected features remain without any sub-features. Whenever necessary, new hierarchy relations may be created [8]. In the last FArM iterations, the system architects commit to a certain architecture and can also employ a specific architectural style by adapting components to a particular style (e.g. a plug-in or a layered architecture).
Fig. 2. The FArM method [8]
5) Common Variability Language (CVL): CVL has been increasingly used to improve the maintainability, readability and scalability of architectural variability specifications [9], [18], [19]. CVL supports the systematic specification of variability over PLA models defined in any Domain Specific Language (DSL) based on the Meta Object Facility (MOF), such that executing the variability specification with resolutions will generate a product model which conforms to the base DSL [9], as illustrated in Figure 3. CVL encompasses three models: (i) the base model, which is described in a DSL based on MOF; (ii) the CVL variability model, which defines variability on the base model; and (iii) the resolution model, which defines how to resolve the variability model to create product models in the base DSL [9]. A CVL library can be used to represent additional variant models in the base DSL. Due to space constraints, the base, variability, and resolution models are described in detail only in Section IV, in the context of the case study.
Fig. 3. CVL transformation (or materialisation) [9]

III. A MODEL-DRIVEN INFRASTRUCTURE FOR PRODUCT LINE ARCHITECTURE DEVELOPMENT
We present a model-driven infrastructure that combines existing methods, tools, models and languages to implement PLAs, as illustrated in Figure 4.
Fig. 4. A semi-automated model-driven method for developing product line architectures
The infrastructure encompasses a semi-automated engineering method that supports a step-by-step refinement of high-level models into lower-level abstractions. In summary, the infrastructure firstly enables the identification and specification of common and variable requirements and features of an SPL (Activities 1 and 2). Secondly, features are realized by a component-based PLA (Activities 3-5). Finally, features and their respective components are chosen, allowing for the instantiation of a final product (Activity 6). The resulting feature model, PLA model and variability model are persisted in XMI (XML Metadata Interchange) format, the standard proposed by the OMG for MOF metadata exchange. The main activities of the proposed method are described in detail in the following.
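Before detailing the activities, the toy sketch below illustrates, in plain Java, the materialisation idea summarised in Section II-B5 and Figure 3: a resolution model (choices resolved positively or negatively) drives fragment substitutions over a base model, yielding a product model. All names are hypothetical and the analogy operates on simple strings; the real transformation is performed by the CVL tooling over MOF/XMI models, not over Java objects.

import java.util.*;

// Toy analogy of CVL materialisation: a positively resolved choice triggers the
// substitution of a placement fragment in the base model by a replacement taken
// from a library of variant fragments (all names are hypothetical).
public class ToyMaterialisation {
    public static void main(String[] args) {
        // Base model: a list of component names with a placeholder acting as a placement fragment.
        List<String> baseModel = List.of("Executive", "<AdjudicatorPlacement>");

        // Library of variant fragments (replacements).
        Map<String, String> library = Map.of("MedianVoter", "MedianVoterComponent",
                                             "AcceptanceTest", "AcceptanceTestComponent");

        // Resolution model: which choices are resolved positively.
        Map<String, Boolean> resolution = Map.of("MedianVoter", true, "AcceptanceTest", false);

        // Materialisation: replace the placement by the replacement bound to the chosen variant.
        List<String> productModel = new ArrayList<>();
        for (String element : baseModel) {
            if (!element.equals("<AdjudicatorPlacement>")) {
                productModel.add(element);
                continue;
            }
            for (Map.Entry<String, Boolean> choice : resolution.entrySet()) {
                if (choice.getValue()) productModel.add(library.get(choice.getKey()));
            }
        }
        System.out.println(productModel);   // prints [Executive, MedianVoterComponent]
    }
}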
A. Activities 1-2: To Specify Use Case and Feature Models
These activities are concerned with domain requirements engineering, which encompasses all activities for eliciting and documenting the common and variable requirements of the SPL [6].
1) Specification of Use Case Models: The input for this activity consists of the product roadmap and domain knowledge. This may include, for example, specifications, (pseudo)code and data flow diagrams related to the products needed. The output comprises reusable, textual and model-based requirements. In particular, for the specification of use cases, we suggest the template proposed by Gomaa [12] (Section II-B1).
2) Specification and Validation of Feature Models: The input for this activity is mainly composed of the use case descriptions. The output comprises a feature model and feature specifications. At this stage, a feature model is specified in order to differentiate among members of the SPL (Section II-B2). Use cases and features can be used to complement each other, thus enhancing variability descriptions [12]. A software engineer is responsible for specifying a feature model that is consistent with the use cases. In particular, use cases should be mapped to features on the basis of their reuse properties [12]. To specify and validate feature models, we adopted FaMa-FM, a widely used framework that supports automated analysis of feature models [20]. For instance, we test whether the feature model is valid and we verify the number of products it encompasses.

B. Activity 3: To Map from Features to a PLA Model
The input of this activity consists of the feature model and feature specifications. The output comprises an initial PLA. To specify the PLA, we adopt the FArM method (Section II-B4). We emphasize that a correct specification of feature realizations is essential for the derivation of correct and intended products. This specification is frequently carried out by a software engineer who has a deep understanding of the product line domain, making automation extremely difficult. Although the transformation from features to a PLA model is not automated, FArM encompasses a series of well-defined transformations on the initial feature model in order to achieve a strong mapping between features and the architecture. Consequently, the FArM method ensures that the feature model is consistent with the PLA. Variability at the PLA level implies using variable configurations of components and interfaces (i.e. provided and required interfaces) [6]. To guarantee that components that need to be configured are successfully coordinated, we adopt a component framework [6], [8]. Variation points are represented by locations in the framework where plug-in components may be added. In addition, variants are realized by specific choices of plug-in. The explicit integration of plug-in mechanisms in the PLA reduces the effort for the composition of the final products [8].

C. Activity 4: To Specify Architectural Variability
The input of this activity comprises the transformed feature model and the PLA model (i.e. the models resulting from the FArM transformations of Activity 3). The output consists of a CVL variability model specifying architectural variability as first-class entities. Although FArM is essential to drive the specification of the PLA, it does not systematically define the details of the transformations required to get a specific product model according to a resolution model.
To overcome this gap, our solution employs CVL to specify software variability explicitly and systematically at the PLA model (Section II-B5). CVL ensures consistency between the PLA model and its products. By means of the CVL-enabled Papyrus Eclipse Plug-in, the initial PLA is first specified in UML, a MOF-based language. The resulting PLA model encompasses the base and variant models, according to the CVL nomenclature (Figure 3). Second, the transformed feature model is used to specify the CVL variability model by using the CVL diagram (the CVL Eclipse Plug-in). This plug-in also enables the definition of how elements of the base model shall be manipulated to yield a new product model, as exemplified in Section IV.

D. Activity 5: To Translate PLAs to Implementation Models
The input of this activity comprises the UML-based PLA model. The output consists of a Java source model encompassing the COSMOS* sub-models (Section II-A) for each architectural component and connector of the PLA model. The PLA model (from Activity 3) is used as the basis for model transformation from component-based software architectures to the detailed design of software components. In particular, we adopt the Bellatrix CASE tool [14] to automate the generation of Java source code in COSMOS* from UML component-based models specified in XMI. By automatically generating Java source code in COSMOS*, we ensure that the PLA implementation is consistent with the PLA model.

E. Activity 6: To Generate Product Models
The input of this activity comprises the UML-based PLA, the CVL variability model and the COSMOS*-based source code. The output consists of a product model architecture and its respective COSMOS*-based implementation. Software components, interfaces, and other software assets are not created anew. Instead, they are derived from the platform by binding variability [6], [9]. At this stage, a software engineer is able to configure a specific product by choosing the set of features required. To this end, it is necessary to create a CVL resolution model (Section II-B5). The realizations of the chosen features are then applied to the base model to derive the product model by using the CVL Eclipse Plug-in. That is, by using the product line and resolution models, specific products are obtained by a CVL model-to-model transformation, which is in charge of exercising the built-in variability mechanisms of the PLA model. The product architecture implementation is also derived by selecting the appropriate COSMOS* components and connectors realising the product model.

IV. EVALUATION
We present an illustrative application of the proposed solution to demonstrate its feasibility and effectiveness and to present a concrete example of our activities in practice.

A. A Family of Software Fault Tolerance Techniques for SOAs
We show the development of a PLA to support a family of software fault tolerance techniques based on design diversity applied to SOAs [21]. In summary, these techniques create fault-tolerant composite services, FT-compositions, that leverage functionally equivalent services, or alternate services [22]. We felt this example was well suited to demonstrate our solution, as existing software fault tolerance techniques differ in terms of (i) quality requirements (e.g. execution time, reliability and financial costs), and (ii) the types of alternates' results that they are able to adjudicate (e.g. simple or complex data types), which makes different techniques more suitable in different contexts [21], [23].
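Before presenting the techniques themselves, the sketch below illustrates, with hypothetical service stubs, the general shape of an FT-composition that the PLA must support: functionally equivalent alternate services are executed (here, in parallel) and an adjudicator (here, a median voter) decides which result is delivered. It is an illustration of the concept only, under assumed names, and not the COSMOS*-based implementation derived later in this section.

import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

// Illustrative FT-composition: parallel execution of alternate services with a
// median voter as the adjudicator (an NVP-like scheme; all names are hypothetical).
public class FtCompositionSketch {
    public static void main(String[] args) throws Exception {
        // Functionally equivalent alternate services (stubs standing in for SOA endpoints).
        List<Function<Double, Double>> alternates = List.of(
                x -> x * 2.0,            // Service 1
                x -> x * 2.0 + 0.01,     // Service 2 (slightly divergent result)
                x -> x * 1.99);          // Service 3

        // Parallel execution scheme: invoke every alternate with the same input.
        ExecutorService pool = Executors.newFixedThreadPool(alternates.size());
        List<Future<Double>> futures = new ArrayList<>();
        for (Function<Double, Double> service : alternates) {
            futures.add(pool.submit(() -> service.apply(21.0)));
        }
        List<Double> results = new ArrayList<>();
        for (Future<Double> f : futures) {
            results.add(f.get());        // gather the alternates' results
        }
        pool.shutdown();

        // Adjudicator (median voter): sort the results and deliver the middle one.
        Collections.sort(results);
        double delivered = results.get(results.size() / 2);
        System.out.println("Adjudicated result: " + delivered);
    }
}

Swapping the adjudicator (e.g. an acceptance test) or the execution scheme (e.g. conditionally sequential) yields the other techniques in the family, which is precisely the variability the PLA has to capture.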
Mass customization using a PLA to support multiple fault tolerance techniques can be employed to address different demands on solutions for FT-compositions. We consider the following fault tolerance techniques: Recovery Block (RcB), N-Version Programming (NVP) and N-Self-Checking Programming (NSCP). In addition, we also take into account the RcB' and NVP' techniques, indicating, respectively, the adoption of parallel execution of the multiple versions combined with acceptance tests, and the fully sequential execution of the multiple versions combined with voters [23]. In the following, we describe the activities and models presented in Figure 4, but individual use case descriptions are omitted due to space constraints. For the specification of use cases, we took into account a set of roles and functionalities that the service composition, in general, should encompass for the aggregation of multiple services into a single composite service (e.g. interoperability capabilities; autonomic composition of services; QoS-aware service compositions; and business-driven automated compositions) [24]. Additional information for the use case specification was taken from domain knowledge of software fault tolerance techniques based on design diversity, such as the various types of adjudicators and their operations [3], [25], [23], [21].
1) The Feature Model: Features were derived from the use case descriptions (Figure 4 - Activity 1). In Figure 5 we show an excerpt of the resulting feature model. The feature model is an extension of the model proposed by Brito et al. [3] and the one from our previous work [22]. In comparison with these previous models, our revised feature model (i) identifies different types of adjudicators (these sub-features were omitted from the figure to keep it readable); (ii) explicitly distinguishes between different sequential execution schemes; (iii) allows the combination of a parallel execution scheme with acceptance tests (for instance, the RcB' technique) and the combination of a fully sequential scheme with voters (for instance, the NVP' technique); and (iv) explicitly states some aspects that are specific to FT-compositions. The Executive feature orchestrates the operation of a software fault tolerance technique. It is composed of seven mandatory features. The Consistency of input data feature presents how the consistency of data is achieved, either implicitly through backward error recovery, or explicitly through a synchronization regime [23], [3]. The Execution scheme feature represents the three possible ways of executing the alternate services: conditionally or fully sequential execution and parallel execution [23], [21]. The Judgement on result feature presents how the acceptance test should be performed, either with an absolute criterion (involving the result of only one alternate service) or a relative criterion (involving the results of more than one alternate service) [3], [25]. The Adjudicator feature captures a set of alternative features related to the different ways that can be employed for detecting errors: by acceptance testing (ATs), voting or comparison [3], [23], [21]. The Number of Alternate Services for tolerating f sequential faults feature represents the number of alternate services needed to tolerate f solid faults [23], [3], [25]. A solid software fault is recurrent under normal operation or cannot be recovered [25]. In contrast, a soft software fault is recoverable and has a negligible likelihood of recurrence.
Provided there are no fault coincidences, an architecture tolerating a solid fault can also tolerate a (theoretically) infinite sequence of soft faults [25]. The Alternate Services feature represents which alternate services will be employed as part of the FT-composition. Alternate services are defined by a common shared terminology so that service selection can be achieved automatically [24]. The Suspension of operation delivery during error processing feature indicates whether the error recovery technique suspends the execution when an error is detected [23], [3], [25]. In case the execution is suspended, it is also necessary to define what the purpose of the suspension is: either re-executing the target operation or only switching to another result [3], [25]. According to the FaMa-FM [20] tool, the proposed feature model (Figure 5) is a valid one and it encompasses 36 products (taking into account the different subtypes of adjudicators).
2) The Product Line Architecture (PLA) Model: We adopt FArM (Section II-B4) to map from features to a PLA (Figure 4 - Activity 3). Regarding the first transformation, our initial feature model contains neither architecture-related nor quality features; consequently, no transformations were performed at this stage. With respect to the second transformation, we identified one architectural requirement: interoperability. Products should provide their main functionalities via services that are defined by means of well-defined interfaces, which should be neutral, platform- and language-independent. These requirements were used to enhance the description of the execution scheme and adjudicator features, which should provide interoperable services. Related to the third transformation, we analysed the interacts relations as follows. The Judgement on Result and Suspension of Service Delivery during Error Processing features are used by (in fact, encompassed by) the Adjudicator and Execution Scheme features. Moreover, we integrated the Implicit Consistency of Input Data feature into the Conditional and Fully Sequential Execution Scheme features, and the Explicit Consistency of Input Data feature into the Parallel Execution Scheme feature, due to their usage interacts relations. Therefore, execution schemes must provide all alternate services with exactly the same system state when their respective executions start [23]. Regarding the fourth transformation proposed by the FArM method, the Number of Alternate Services for tolerating f sequential faults feature and its sub-features do not present a valid hierarchy relation; they simply represent a value assignment. We decided to remove these features from the transformed model. Due to this removal, we enhanced the specification of the adjudicators by explicitly specifying the number of alternate services each type of adjudicator should ideally employ to tolerate f sequential faults.
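As background for the adjudicator-specific numbers of alternate services just mentioned, the sizing rules commonly cited in the design-diversity literature (e.g. Pullum [23] and Laprie et al. [25]) can be summarised as follows; they are quoted here as assumed reference values rather than as part of the transformed feature model.

\[
N_{\mathrm{RcB}}(f) = f + 1, \qquad N_{\mathrm{NVP}}(f) = 2f + 1, \qquad N_{\mathrm{NSCP}}(f) = 2(f + 1),
\]

where N(f) denotes the number of alternate services (variants) ideally required to tolerate f sequential solid faults when errors are detected by acceptance tests (RcB), by majority voting (NVP), or by comparison of self-checking pairs (NSCP).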
3) The CVL Models: At this stage, CVL was adopted to explicitly specify architectural variability (Figure 4 - Activity 4). Figure 6 shows an excerpt of the transformed feature model and PLA (from the FArM transformations). We have omitted architectural connectors and interface names for the sake of clarity. More specifically, the transformed feature model is represented by means of a variability specification tree (VSpec) and the PLA model by means of the base and variant models. VSpecs are tree structures whose elements represent choices bound to variation points. To generate product models, these choices are resolved by a resolution model and propagated to the variation points and the base model [9]. The base and variant models are represented as component-based software architectures using UML. Through the CVL model-to-model transformation, we are able to automatically generate UML software architectures of the target software fault tolerance techniques, which are our products. We represent the alternate services as software components to keep an explicit representation of them. In general terms, VSpec trees are similar to feature models and deciding choices is similar to selecting features [9]. In the VSpec in Figure 6, choices are represented by rounded rectangles with their names inside. A solid link between choices indicates that the child choice is resolved according to its parent, whereas a dashed link indicates that the child choice is not implied by its parent [9]. For example, if Executive is resolved to true, its sub-features are also resolved to true, whereas when Execution Scheme is true, its sub-features can be resolved either positively or negatively. Choices have a group multiplicity, indicated by a triangle, with a lower and an upper bound (e.g. we can select up to n alternate services and exactly one adjudicator). A variation point is a specification of concrete variability in the base model and it defines specific modifications to be applied to the base model during materialisation. Variation points refer to base model elements via base model handles and are bound to elements of the VSpec [9]. For instance, we represent variability on the base model by means of FragmentSubstitution. A FragmentSubstitution is a choice variation point which specifies that a fragment of the base model, called the placement, may be substituted by another, called the replacement. As illustrated in Figure 6, we created auxiliary object models, which are not transformed to source code, to define placements. The replacement objects, or variant models, in turn, can be found in the CVL Library. The resolution model is a structural replica of the corresponding VSpec model such that choices are positively or negatively resolved [9], as exemplified in Figure 6.
Fig. 5. A Partial Feature Model for Software Fault Tolerance Techniques Applied to SOA (notation proposed by Ferber et al. [17])
Fig. 6. Using CVL to explicitly and systematically specify and resolve variability at the PLA model
In particular, we always assume that a resolution model starts with a set of resolutions: choices related to the mandatory features (i.e. Executive, Adjudicator, Selector and Execution Scheme) are always set to true, whereas alternative and multiple choices must be resolved to true or false. In Figure 6, we select the Parallel execution scheme, choose Service 1, Service 2 and Service 3 as the alternate services, and judge the results from the alternate services by using a median voter (Median Vt). Therefore, after executing the CVL transformation, we have, as a resolved model, an architectural specification of N-Version Programming (NVP) with a median voter. Finally, we emphasize that CVL supports the definition of OCL (Object Constraint Language) constraints among elements of a VSpec to define feature constraints [9].
For example, for the conditionally sequential execution scheme, we should specify that this execution scheme implies an acceptance test as the adjudicator, thus being able to discard invalid configurations. Figure 7, which is semantically equivalent to Figure 6, shows an excerpt of the models we generated by means of the CVL Eclipse Plug-ins. From left to right, we have the following models. First, we defined the elements of the CVL library, for instance, the variant models related to the different types of adjudicators, execution schemes and alternate services. Secondly, we have the base model, which mainly encompasses the mandatory features (or choices, in CVL notation), with the exception of auxiliary elements defined solely to support the specification of placement fragments. Thirdly, we have the model representing the software architecture of the NVP software fault tolerance technique employing a median voter as an adjudicator. Finally, we have the variability specification tree in which variation points are defined and linked to the base model and to elements from the CVL library. In the variability specification tree, the highlighted elements are part of the resolution model, in which the choices related to the instantiation of NVP with a median voter are specified (Figure 4 - Activity 6). The CVL tool also supports validation operations and the defined variability model was validated successfully.
4) The PLA Implementation: Bellatrix was used to materialise, through model-to-model transformation, the elements of the UML-based PLA into Java source code in COSMOS* (Figure 4 - Activity 5).

V. LESSONS LEARNED
We proposed a model-driven and semi-automated infrastructure to define and manage PLAs. Our solution encompasses a model-based method (i) to specify common and variable use cases as proposed by Gomaa [12]; (ii) to specify a feature model [17]; (iii) to map features to the PLA using FArM [8]; (iv) to specify variability at the architectural level using CVL [9]; and (v) to automate the generation of code related to the software architecture using COSMOS* [13]. These reused approaches have been validated, are well documented, and have been explored in other domains [8], [9], [18], [19], [12]. In contrast, in this work we document and explore how these approaches and related tools can be used together to develop PLAs. In particular, we employ this method to define a family of software fault tolerance techniques for fault-tolerant service compositions. We discuss lessons learned from executing this case study. Activities 1 and 2 (Figure 4) were adopted to specify common and variable requirements, which was essential to achieve a better understanding of the commonality and the variability of the intended SPL and to define the set of applications the SPL is planned for. Mapping from use cases to features is an error-prone and time-consuming task. Although fully automating this task remains a research challenge, it would benefit greatly if an initial feature model could be automatically generated by analysing the reuse properties of the use cases. Under this circumstance, a software engineer could incrementally refine this initial feature model by specifying appropriate many-to-many associations [12]. Activity 3 specifies a feature-oriented development method to map features to architectural components using FArM. The transformation from the feature model to the PLA went smoothly, although it is not automated.
We observed that mapping a feature to the PLA can be performed systematically by carrying out the series of FArM transformations for the feature at hand [8]. Furthermore, due to the encapsulation of the feature business logic in one architectural component, the effects of removing and updating features were localized. As claimed by Sochos et al. [8], we noticed that FArM improves the maintainability and flexibility of PLAs. Nevertheless, FArM does not (i) define the details of the transformations required in order to generate product models in accordance with feature configurations; nor (ii) provide clear guidelines on how to describe all the traceability links among the initial and transformed feature models and the resulting PLA. Activity 4 uses CVL to specify software variability as a first-class concern. The CVL Eclipse Plug-in greatly facilitates the specification and resolution of software variability. By specifying variability systematically, we are able to produce the right product models automatically given a resolution model encompassing a set of preselected features (Activity 6). This formalization of the product creation process improves analyzability and maintainability [9], [18], [19]. In addition, by using CVL, the specification of the PLA using UML does not have to be extended or overloaded with variability. The base and variant models are oblivious to the CVL model, while the CVL model refers to objects of the base model [9]. Because of this separation of concerns, there may be several variability models applying to the same architectural model, which also facilitates the evolution of these product line assets [9]. We noticed that there are several strategies for defining a base model (Figure 3). One obvious choice is to define a complete model from which CVL can remove features to get a specific product model [19]. Another choice is to adopt a minimal CVL model such that a product model will be generated by adding features to the base model [19]. We preferred to use neither the maximal nor the minimal strategy, but something in between. We created auxiliary elements in the base model to represent placement fragments. The variant models (found in the CVL Library) were then used as replacements for these placement fragments. This seemed to be the most practical way to specify fragment substitutions in our software solution, as it reduces the number and complexity of substitutions. Furthermore, we noticed that using the transformed feature model, which is composed exclusively of functional features, and the PLA from FArM was essential to guarantee compact variability specification and product realization layers, as required by CVL. This is due to the strong mapping, ensured by FArM, from features to software components. Using CVL with a minimal feature specification layer is a sufficient condition to make a new product as quickly as possible [18]. The learning curve of the proposed method is an important consideration due to the different types of tools and documentation that are available for the activities. The specification of use cases was easily performed, as Gomaa [12] presents a template for use case specifications. With respect to FaMa-FM, XML (eXtensible Markup Language), Lisp or plain text can be used to load and save feature models [20]. Although many examples are available illustrating the feature model notations employed by FaMa-FM, GUI editor support is missing.
Fig. 7. Using the CVL Eclipse Plug-in to specify architectural variability
With respect to FArM, the transition from a feature model to a PLA is explained by the authors in a short but comprehensive way, as they provide a running example illustrating each FArM transformation [8]. CVL is a relatively small language with clearly defined semantics, making it less time-consuming to learn [18]. By being able to use the base DSL editor to select and highlight fragments, it is not required to know the lower-level implementation details of CVL [18]. In our case, the definition of fragments and substitutions on base models was intuitive, as we were familiar with our architectural model and the Papyrus Eclipse Plug-in. The CVL Plug-in requires an old Eclipse version and some compatible libraries [9]. Finding some of the required libraries was a time-consuming task. The COSMOS* implementation model also defines clear semantics, which are in fact representative of software component models. It was also straightforward to implement component-based software architectures, specifically due to the support of Bellatrix (Activities 5 and 6). The tasks to ensure consistency among models are not fully automated. In some refinements from high-level models into lower-level abstractions, a software engineer is responsible for ensuring that models are consistent. Because this is an error-prone activity, we identified useful guidelines to facilitate the accomplishment of this task (e.g. the method by Gomaa [12] to map from use cases to features and FArM to map from features to PLAs). Nevertheless, despite the smooth transition from the feature model to the architecture, the process could be greatly facilitated if there were one tool that provided support for all the activities. This would achieve greater consistency, traceability among models, efficiency and familiarity within the development environment, which would lessen the learning curve. Unfortunately, it is difficult to find support for specific modelling needs (e.g. modelling optional/mandatory features, architectural variability, etc.) at the same time as finding compatible tools that support a large range of the specified activities.

VI. RELATED WORK
Zhang et al. [10] propose a tool to compare existing potential product models to identify reusable assets such as model fragments. A preliminary PLA model and a CVL variability model can be automatically specified based on the comparison results. Our solutions complement each other, as we can employ the CVL Compare tool to generate the initial PLA model and they can use Bellatrix to generate code automatically. The solution by Chastek et al. [11] encompasses a set of processes and tools to ensure that models are effectively coordinated. Our solution can be regarded as an instantiation of their method engineering, which is described at a higher level of abstraction. Their solution allows neither the automatic generation of product models given a resolution model, nor the automatic translation of the PLA model to an implementation model. Unlike our solution, Chastek et al. [11] present general guidelines to create test assets by using the JUnit test framework. Azanza et al. [1] present an MD-PLE approach, but they focus on extracting product line artefacts from existing software applications. Our solution, in fact, can be regarded as a proactive approach for mass customization [7], which is appropriate when the requirements for the set of products needed are well defined and stable [7].
Buchmann et al. [26] propose an MD-PLE approach for software configuration management systems. They are able to instantiate an application-specific system, which is executable, according to the selection of features. We intend to reuse their solution to refine the COSMOS*-based implementation of PLAs. To specify architectural variability, they developed a tool that maps features to corresponding domain model elements by manually annotating elements of the domain model with feature expressions. Specifying and managing these annotations is an error-prone activity. At this point, we adopt CVL, a well-defined language for specifying and resolving software variability. A number of works have used CVL to specify and resolve software variability [9], [18], [19]. In particular, we refer to the work by Svendsen et al. [18] for a detailed description of how to use CVL, its main concepts, the CVL tool support and a discussion of different strategies to choose a base model and to create the CVL Library. One of the main purposes of that work [18] is to show the applicability of CVL to support the transformation of an original model (e.g. a product line model) into a configured, new product model.

VII. CONCLUDING REMARKS
We proposed a comprehensive, semi-automated, systematic and model-driven engineering method for product line architecture (PLA) development, ensuring that models are effectively coordinated. The method is supported by an infrastructure that encompasses a set of existing processes, languages and tools, which are consistent and interoperable. The process involves extracting common and variable functional requirements of the product line using use case models. To differentiate among members of the product line, features are extracted to form a feature model and subsequently mapped to a component-based PLA model. Finally, using preselected features, individual component-based product models are generated through model-to-model transformations. Our solution employs the Common Variability Language to specify and resolve architectural variability and COSMOS*, a component implementation model, to realize the PLA. We presented well-known Eclipse Plug-ins that can be used to manage and validate the models and to support model-to-model transformations. To exemplify and evaluate the proposed solution, we employed it to develop a PLA to support a family of software fault tolerance techniques for fault-tolerant composite services. The results from this preliminary evaluation show the feasibility and effectiveness of the proposed infrastructure to develop PLAs and suggest directions for future work. We encountered some problems related to the process of using multiple techniques together. These limitations include issues with the consistency of artefacts, traceability among models and familiarity within the development environment. To overcome these drawbacks, we intend to employ an integrated environment that could handle the entire process instead of using independent tools.

ACKNOWLEDGMENT
This research was sponsored by UOL (www.uol.com.br), through its UOL Bolsa Pesquisa program, process number 20120217172801. Fernando is supported by CNPq (306619/2011-3), FACEPE (APQ-1367-1.03/12), and by INES (CNPq 573964/2008-4 and FACEPE APQ-1037-1.03/08).

REFERENCES
[1] M. Azanza, J. D. Sosa, S. Trujillo, and O. Díaz, "Towards a Process-Line for MDPLE," in 2nd Int. Work. on MDPLE, 2010.
[2] K. Czarnecki, M. Antkiewicz, C. H. P. Kim, S. Lau, and K. Pietroszek, "Model-driven software product lines," in Proc. of OOPSLA '05, 2005, pp. 126–127.
[3] P. Brito, C. Rubira, and R. de Lemos, "Verifying architectural variabilities in software fault tolerance techniques," in Proc. of WICSA/ECSA 2009, Sept. 2009, pp. 231–240.
[4] X. Zhang and B. Møller-Pedersen, "Towards correct product derivation in model-driven product lines," in Proc. of the 7th SAM'12, 2013, pp. 179–197.
[5] A. Magalhaes, J. David, R. Maciel, B. Silva, and F. da Silva, "Modden: An integrated approach for model driven development and software product line processes," in Proc. of SBCARS'11, 2011, pp. 21–30.
[6] K. Pohl, G. Böckle, and F. J. v. d. Linden, Software Product Line Engineering: Foundations, Principles and Techniques. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2005.
[7] C. W. Krueger, "Easing the transition to software mass customization," in Proc. of the 4th Int. Work. on Soft. Product-Family Engineering, 2002.
[8] P. Sochos, M. Riebisch, and I. Philippow, "The feature-architecture mapping (FArM) method for feature-oriented development of software product lines," in Proc. of the 13th ECBS, 2006, pp. 308–318.
[9] Common Variability Language, "start — common variability language," 2012, [Online; accessed 24-April-2013]. [Online]. Available: http://www.omgwiki.org/variability/doku.php?id=start&rev=1351084099
[10] X. Zhang, O. Haugen, and B. Møller-Pedersen, "Model comparison to synthesize a model-driven software product line," in Proc. of SPLC'11, 2011, pp. 90–99.
[11] G. Chastek, P. Donohoe, J. McGregor, and D. Muthig, "Engineering a production method for a software product line," in Proc. of the 15th SPLC, 2011, pp. 277–286.
[12] H. Gomaa, Designing Software Product Lines with UML: From Use Cases to Pattern-Based Software Architectures. Redwood City, CA, USA: Addison Wesley Longman Publishing Co., Inc., 2004.
[13] L. A. Gayard, C. M. F. Rubira, and P. A. de Castro Guerra, "COSMOS*: a COmponent System MOdel for Software Architectures," IC, University of Campinas, Tech. Rep. IC-08-04, February 2008.
[14] R. T. Tomita, F. C. Filho, P. A. Guerra, and C. M. F. Rubira, "Bellatrix: An environment with architectural support for component-based development (in Portuguese)," in Proc. of the IV Brazilian Work. on Component-Based Development, 2004, pp. 43–48.
[15] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 2nd ed. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 2003.
[16] K. Czarnecki and U. W. Eisenecker, Generative Programming: Methods, Tools, and Applications. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 2000.
[17] S. Ferber, J. Haag, and J. Savolainen, "Feature interaction and dependencies: Modeling features for reengineering a legacy product line," in Proc. of SPLC 2002, 2002, pp. 235–256.
[18] A. Svendsen, X. Zhang, R. Lind-Tviberg, F. Fleurey, O. Haugen, B. Møller-Pedersen, and G. K. Olsen, "Developing a software product line for train control: a case study of CVL," in Proc. of the 14th Int. Conf. on Software Product Lines: Going Beyond, 2010, pp. 106–120.
[19] E. Rouille, B. Combemale, O. Barais, D. Touzet, and J.-M. Jezequel, "Leveraging CVL to manage variability in software process lines," in Proc. of the 19th APSEC, 2012, pp. 148–157.
[20] "FaMa framework," Apr. 2013, http://www.isa.us.es/fama.
[21] A. S. Nascimento, C. M. F. Rubira, R. Burrows, and F. Castor, "A systematic review of design diversity-based solutions for fault-tolerant SOAs," in Proc. of the 17th EASE, 2013, pp. 107–118.
[22] A. S. Nascimento, C. M. F. Rubira, and J. Lee, "An SPL approach for adaptive fault tolerance in SOA," in Proc.
of the 15th SPLC, vol. 2, 2011, pp. 1–8.
[23] L. L. Pullum, Software Fault Tolerance Techniques and Implementation. Norwood, MA, USA: Artech House, Inc., 2001.
[24] M. P. Papazoglou, P. Traverso, S. Dustdar, and F. Leymann, "Service-oriented computing: State of the art and research challenges," Computer, vol. 40, no. 11, pp. 38–45, 2007.
[25] J. C. Laprie, C. Béounes, and K. Kanoun, "Definition and analysis of hardware- and software-fault-tolerant architectures," IEEE Computer, vol. 23, no. 7, pp. 39–51, Jul. 1990.
[26] T. Buchmann, A. Dotor, and B. Westfechtel, "Mod2-SCM: A model-driven product line for software configuration management systems," Infor. and Soft. Technology, vol. 55, no. 3, pp. 630–650, 2013.

A Reference Architecture based on Reflection for Self-adaptive Software
Frank José Affonso, Dept. of Statistics, Applied Mathematics and Computation, Univ Estadual Paulista - UNESP, Rio Claro, SP, Brazil, [email protected]
Elisa Yumi Nakagawa, Dept. of Computer Systems, University of São Paulo - USP, São Carlos, SP, Brazil, [email protected]

Abstract—Self-adaptive Software (SaS) presents specific characteristics compared to traditional software, as it makes it possible for adaptations to be incorporated at runtime. These adaptations, when manually performed, normally become an onerous, error-prone activity. In this scenario, automated approaches have been proposed to support such adaptations; however, the development of SaS is not a trivial task. In parallel, reference architectures are reusable artifacts that aggregate the knowledge of architectures of software systems in specific domains. They have facilitated the development, standardization, and evolution of systems of those domains. In spite of their relevance, reference architectures that could support a more systematic development of SaS are not yet found in the SaS domain. Considering this context, the main contribution of this paper is to present a reference architecture based on reflection for SaS, named RA4SaS (Reference Architecture for SaS). Its main purpose is to support the development of SaS that presents adaptations at runtime. To show the viability of this reference architecture, a case study is presented. As a result, it has been observed that RA4SaS presents good perspectives to efficiently contribute to the area of SaS.

I. INTRODUCTION
Software systems have come to occupy an important role in modern society, operating in several segments, including public and private institutions, airports, communication systems, and so on. These systems must sometimes be prepared to perform their functions in normal operation and be available 24 hours per day, seven days per week. Moreover, most of them must be prepared to operate in adverse conditions, maintaining their integrity of execution. Therefore, characteristics such as robustness, reliability, scalability, customization, self-organization, and self-adaptation are increasingly required of these systems. In particular, the last three characteristics match a specific type of software system, namely self-adaptive systems1. These systems make possible the incorporation of new features at runtime, i.e., adaptations without interrupting the execution [1], [2], [3], [4], [5], [6]. Software adaptation, when manually performed, becomes an onerous (with regard to time, effort, and money) and error-prone activity, mainly due to the involuntary injection of uncertainties by developers [7], [8], [9], [10].
To overcome these adversities, automated approaches have been adopted, as they are an alternative to maximize the speed of SaS implementation and, at the same time, to minimize developers' involvement [11].
1 The term self-adaptive systems is used in different areas/domains. In this paper, we focus only on the software domain and, therefore, it will be referred to as Self-adaptive Software (SaS).
From another perspective, reference architectures refer to a special type of software architecture that has become an important element for the systematic reuse of architectural knowledge. The main purpose of these architectures is to facilitate and guide [12]: (i) the design of concrete architectures for new systems; (ii) the extension of systems in domains neighbouring that of a reference architecture; (iii) the evolution of systems that were derived from the reference architecture; and (iv) the improvement of the standardization and interoperability of different systems. Reference architectures can occur at different levels of abstraction. For example, there are generic reference architectures for service-oriented architectures, such as IBM's foundation architecture [13]. On the other hand, more specific architectures can be found; for instance, architectures for e-contracting [14], web browsers [15], and software engineering environments [16] have been proposed. However, in spite of the considerable relevance of such architectures, there is a lack of architectures that support the development of SaS. Therefore, SaS are currently developed without concern for the reuse of previously acquired knowledge and experience, such as best practices, guidelines, and suitable architectural styles for SaS. In this scenario, the main goal of this paper is to present a reference architecture for SaS, named RA4SaS (Reference Architecture for SaS). This architecture is mainly based on reflection, an important resource for the inspection and modification of software entities2 at runtime.
2 From this point onwards, SaS will also be referred to as software entities or simply entities.
RA4SaS makes it possible to develop SaS more systematically, with two main functions: software monitoring and software adaptation at runtime without stakeholders' intervention. In order to observe the viability of this architecture, case studies have been executed. As the main result, we have observed that this architecture can be considered an important contribution to the area of SaS. The remainder of this paper is organized as follows. Section II presents the background and related work. Section III presents RA4SaS. Section IV presents a case study. Finally, conclusions and perspectives of future work are presented in Section V.

II. BACKGROUND AND RELATED WORK
A reference architecture is a special type of architecture that provides major guidelines for the specification of concrete architectures of a class of systems [17]. In order to systematize the design of such architectures, guidelines and processes have been established [18], [19], [20], [21]. Moreover, the effective knowledge reuse of a reference architecture depends not only on raising the domain knowledge, but also on documenting and communicating this knowledge efficiently through an adequate architectural description. Commonly, architectural views have been used, together with UML (Unified Modeling Language) techniques, to describe reference architectures.
Considering their relevance as a basis for software development, a diversity of architectures has been proposed and used, including for (self-*) software, i.e., software that is capable of self-configuration, self-adaptation, self-healing, self-monitoring, self-tuning, and so on [22], [23], [24], [25]. However, there is currently no reference architecture for the SaS domain, in particular no architecture based on reflection. Computational reflection, or simply reflection, can be defined as any activity performed by a system on itself, being very similar to human reflection. The main goal of software systems that present reflection is to obtain information about their own activities, aiming at improving their performance, introducing new capabilities, or even solving their problems by choosing the best procedure. Therefore, the use of reflection makes it possible for software to be more flexible and more amenable to changes [1], [7], [8], [26]. In the following, some of the main uses of reflection for software adaptation are exemplified. Borde [27] proposed a technique based on reflection for the adaptation of software components with regard to structure and behavior. Existing functionalities are preserved and others (new requirements) are added, constituting a new software component. This technique also provides control over the number of modifications that can be applied to software components, as these components can quickly increase in size and, consequently, their reusability could be reduced. In [28], reflection is used as a mechanism for the dynamic adaptation of distributed systems. The author proposed an RMI (Remote Method Invocation) extension, named XRMI (eXtended RMI), which makes it possible for an application to monitor and manipulate remote invocations among components during a dynamic adaptation. This proposal is based on the Proxy pattern, which makes it possible for remote objects to behave transparently to the application, as if they were locally deployed. Tanter et al. [29] proposed a tool, named Reflex, for dynamic software adaptation based on reflection and Aspect-Oriented Programming (AOP). This tool provides both structural and behavioral facilities, adopting a reflection model which makes possible selective, fine-grained control of where and when reflection occurs. Additionally, Janik & Zielinski [30] developed a technique to adapt non-functional requirements, leaving the functional level intact. The authors proposed an AOP extension, named AAOP (Adaptable AOP), which represents a set of adaptable non-functional aspects that are integrated into the software at runtime. According to Shi et al. [3] and Peng et al. [4], reflection has been used successfully in software component reuse and has been applied on a large scale in the reuse of software architectures and their components. Both works divide the architecture into two levels: (i) the meta level, which contains the architectural components and information describing the base level, such as the architecture topology, components, and connectors; and (ii) the base level, which can be considered a traditional software architecture. Thus, it is observed that there are already important initiatives using reflection in the development of software systems. With regard to related work, reference architectures and reference models (a higher-level artifact compared with a reference architecture) are found. Liu et al. [24] proposed a reference architecture for service-oriented computing based on a self-organizing mechanism.
The goal of this architecture is to build self-organizing software components while keeping the system complexity hidden from human system participants. For this, the systems must be able to ensure transparency of the monitoring and modification control in relation to their stakeholders. Weyns et al. [22] also proposed a reference model, named FOrmal Reference Model for Self-adaptation (FORMS), based on reflection (reflective models and reflective subsystems). In order to demonstrate its applicability, the FORMS model was instantiated in two domains: the MAPE-K model [31] (composition of autonomic systems) and robotic systems (unmanned vehicles). Another reference model based on self-healing and self-optimizing mechanisms was proposed by Weyns et al. [25]. This model supports the development of decentralized SaS, such as (i) intelligent cameras that collaborate to detect and monitor traffic jams, and (ii) improvement of the quality of services via redeployment of their components. In short, this model reifies the basic constructs of SaS and extends them with additional constructs to support the engineering of decentralized SaS. Finally, Kramer & Magee [23] presented an architectural approach for self-managed systems (self-* or autonomic systems). The authors proposed an architectural model with two levels (meta and base), both with reflective characteristics. Modifications in the base level (application) are managed by an action plan (meta level), which determines what modifications can be implemented. These related works present evidence of the relevance of reference architectures and reference models for SaS; therefore, efforts to establish a reference architecture are worthwhile, including architectures that explore reflection for SaS development.

III. ESTABLISHING RA4SAS
RA4SaS aims at supporting the development of SaS whose main functionalities are monitoring and adaptation at runtime without the perception of those involved (the stakeholders). In order to establish this architecture, a process to build reference architectures, named ProSA-RA (Process based on Software Architecture - Reference Architecture) [19] and illustrated in Figure 1, was used.
Figure 1. Outline Structure of ProSA-RA (Adapted from [19])
In short, to establish reference architectures using ProSA-RA, information sources are firstly selected and investigated (in Step RA-1) and architectural requirements are identified (in Step RA-2). After that, an architectural description of the reference architecture is established (in Step RA-3) and an evaluation of this architecture is conducted (in Step RA-4). In the following, each step is presented in more detail in order to establish RA4SaS.

A. Step RA-1: Information Source Investigation
In this step, information sources were identified, aiming to elicit RA4SaS requirements. Different sources were considered, mainly related to reference architectures and reference models for the SaS domain.
Table I. THE MODEL 5W1H AND ADAPTATION SCOPE
Questions | Scopes
What will be adapted? | Attributes, software entities, architectural components, etc.
Where will the adaptation occur? | Software system, software architecture, architectural components, etc.
Who will perform the adaptation? | Autonomous systems, humans, both, etc.
When will the adaptation be applied? | How often, anytime, constantly, etc.
Why will it be adapted? | Errors in the project, new requirements (users or technology), etc.
How will it be adapted? | Is there any action plan for adaptation? Was the adaptation planned?, etc.
Thus, the two sets of sources were: (i) guidelines for the development of SaS; and (ii) reference architectures and reference models for self-adaptive systems. The purpose of the second source is to understand: (a) how the dynamic behavior is represented in reference architectures and reference models; and (b) what type of knowledge/element is used to design reference architectures and reference models for SaS. It is worth highlighting that, to identify information sources for these sets, a systematic review3 was conducted. In this paper, for reasons of space, the details of this systematic review are not presented. In short, after setting the review protocol and conducting the selection process, 27 primary studies were included as relevant for the context of this work. From these studies, information was extracted to form the basis of our reference architecture, as presented below and grouped into Set 1 and Set 2. Finally, it is noteworthy that the use of systematic reviews to support the gathering of information sources and requirements elicitation has already been investigated elsewhere [34], [35].
3 Systematic review is a technique proposed by Evidence-Based Software Engineering that enables a complete and fair evaluation of a topic of interest [32], [33].
Set 1 - Guidelines for the development of SaS. As previously mentioned, SaS has specific characteristics compared to traditional software, mainly regarding adaptations at runtime. Aware of these characteristics, authors have proposed works intended to minimize the deficiencies existing in this area. Based on these works, it is possible to identify and establish a panorama of the area of (self-*) systems. In order to establish this panorama and to elicit the adaptation requirements of software systems, we opted to use the Kipling method (5W1H)4, cited by Salehie & Tahvildari [9]. Table I summarizes the questions of this method related to adaptations in a software system and the possible scopes of action. Information about the scopes will be useful during the establishment of RA4SaS.
4 This method has six questions (What, Where, Who, When, Why and How) and is called 5W1H, from the "Six Honest Men" poem by R. Kipling in the book "Just So Stories", Penguin Books, London, 1902.
Still on this panorama, some authors, such as Kramer & Magee [23], Salehie & Tahvildari [9] and Han & Colman [36], report three common problems in the area of (self-*) software systems: (i) the use of the term (self-*) systems in different domains (e.g., engineering, computer science, medicine, etc.); (ii) the existence of many synonyms for (self-*) properties; and (iii) the difficulty of implementing software systems with two or more (self-*) properties. Indirectly, these problems imply difficulties in establishing development standards and in consolidating the research area. In contrast, the efforts of researchers such as Coulson et al. [2], Bradbury et al. [37], Salehie & Tahvildari [9], Andersson et al. [8], and Villegas et al. [38] have been directed at alleviating these problems and trying to reverse this scenario, establishing a relationship between the primitive terms and their synonyms and the quality factors of software products [39], [40]. Table II summarizes the terms that were considered in the RA4SaS design, aiming at aggregating the related quality factors.
Table II. Self-* properties vs. quality factors (Primitive term / Synonyms / Quality factors)
self-configuring (no synonyms): Functionality, Integrity, Maintainability, Portability, Scalability, Usability
self-healing (self-diagnosing, self-repairing): Availability, Integrity, Maintainability, Reliability, Survivability
self-optimizing (self-adjusting, self-tuning): Efficiency, Functionality, Integrity, Maintainability, Performance, Throughput
self-protecting (self-defense): Availability, Confidentiality, Functionality, Integrity, Reliability

Other terms, such as self-managing, self-governing, self-maintenance, self-control, and self-organizing, represent global properties of self-adaptive software and, therefore, were not considered in RA4SaS. Besides that, the terms self-awareness, self-monitoring, and self-situated are considered primitive properties, as presented in [8], [9], [38], and were also not considered.

Set 2 - Reference architectures for self-adaptive systems. As previously mentioned, a systematic review was conducted with the objective of identifying the following interests: (i) how the dynamic behavior is represented in reference architectures and reference models; and (ii) which type of knowledge/element is used to design reference architectures and reference models for SaS. From the 27 obtained studies, information for both interests was extracted. For interest (i), a set of six categories was created:

1) Block Diagrams (BD): a system diagram in which the principal parts or functions are represented by blocks connected by lines that show the relationships of the blocks;
2) Formal Methods (FM): a process that uses formal methods, for example, the CSP algebra (Communicating Sequential Processes), the Z notation, etc.;
3) Formal Notation for Business Process (FN): a graphical representation for specifying business processes in a business process model, for example, BPMN (Business Process Modeling Notation), BPEL (Business Process Execution Language), etc.;
4) Informal Notation (IN): a kind of notation that does not follow a formal or semi-formal method or methodology;
5) Layers Diagrams (LD): a high-level representation for organizing physical artifacts. These diagrams describe the major tasks that the artifacts perform or the major components. A layer can be subdivided into sublayers that describe more detailed tasks; and
6) UML Diagrams (UD): a graphical language for visualizing, specifying, constructing, and documenting software systems.

For interest (ii), we identified the types of knowledge/elements used to design the dynamic behavior in reference architectures.
We organized them into ten categories:

1) Action plan: it represents a sequence of actions to execute a specific activity in the software system;
2) Agents: they encompass software agents, intelligent agents, and autonomous agents. They are software entities independent from the software system, but that work cooperatively to assist the dynamic behavior;
3) Autonomous subsystems: they are subsystems that implement the model (Monitor - Analyze - Plan - Executor) defined in the architecture model of autonomic computing;
4) Computational Reflection: it can be any activity performed by a system on itself, being very similar to human reflection;
5) Nature-inspired pervasive service ecosystems: they take inspiration from natural systems, enabling the modeling and deployment of services as autonomous individuals, spatially situated in a system of other services, data sources, and pervasive devices;
6) Process flow: it represents a sequence of steps for performing an activity. In this sequence, decisions can be taken, characterizing the changes in the dynamic behavior of the software;
7) Rule base: it represents a set of rules that can be used to represent the dynamic behavior of software. These rules define which changes can be performed at runtime. In addition, these rules can be updated and others can be added as a way to insert new behaviors into the software systems and/or their architecture;
8) Service composition: new functionalities that can be coupled at runtime so that new requirements are met;
9) Subsystems in layers: a set of subsystems organized in layers so that one or more activities are performed. These subsystems are autonomous units that enable the dynamic behavior of self-managed software systems; and
10) Supervisor systems: systems responsible for monitoring the operation of another system. The supervisors control changes (structural and/or behavioral) in the supervised systems.

Figure 2. Representation vs. type of knowledge

Figure 2 shows a mapping between the six categories related to representation and the ten categories related to type of knowledge, as well as the number of studies found. It is noteworthy that some studies used more than one representation and/or type of knowledge (note the solid lines that connect the bubbles). Analyzing this mapping, it is possible to observe that the following combinations stand out: (i) block diagrams and autonomous subsystems; (ii) block diagrams and rule base; (iii) formal methods and rule base; and (iv) layers diagrams and computational reflection. This mapping aims at guiding the choice of the techniques used to represent RA4SaS, as well as of the knowledge/elements contained in this architecture.

B. Step RA-2: Architectural Requirement Establishment

Based on the information identified in the previous step, a set of architectural requirements for RA4SaS was established. This set was classified into two subsets: (i) architectural requirements of the SaS domain, obtained from Set 1; and (ii) architectural requirements related to the infrastructure for software system adaptation, obtained from Sets 1 and 2. Table III illustrates part of these requirements. The first column refers to the requirement identification (R-S, i.e., requirement related to the Self-adaptive Software domain, and R-I, i.e., requirement related to the Infrastructure of adaptation); the second column contains the requirement description; and the third column contains the concepts related to the requirement. For instance, requirement RS-2 (Table III) is related to the Adaptation Criteria concept. The same analysis was conducted for each requirement of RA4SaS. As a result, three concepts were identified: Development of SaS (DS), Representation of SaS (RS), and Infrastructure of SaS (IS). The last one was specialized into Dynamic Compilation (DC), management of the Number of Adaptations (NA), Action Plan (AP), Restoring the Execution (RE) state for adaptation, and so on. Finally, it is worth highlighting that these concepts were established based on the SaS literature, such as [8], [9], [36], [37], [38], as well as on the authors' experience in SaS development.

Table III. Part of the RA4SaS requirements (ID / Requirement / Concept)
RS-1: The reference architecture must enable the development of SaS that has a mechanism to define the adaptation level of software systems. (DS)
RS-2: The reference architecture must allow the development of SaS that are able to reflect on their current execution state, identifying their structure and behaviors. (DS)
RS-3: The reference architecture must enable the development of SaS that are able to keep their representation after performing an adaptation activity. (RS)
RS-4: The reference architecture must allow the development of SaS that are able to adapt by modifying their structure and behavior. (DS)
...
RI-1: The reference architecture must enable the development of SaS that are able to replace the software system (software entity) at runtime, which represents the capacity to perform dynamic compilation and dynamic loading of software entities without restarting the application or redeploying its components. (IS-DC)
RI-2: The reference architecture must allow the development of SaS that are able to control the number of adaptations, so that entities do not quickly increase in size, which would make future adaptations unfeasible. (IS-NA)
RI-3: The reference architecture must enable the development of SaS that are able to establish an action plan for software adaptation, which represents a sequence of steps so that the software system is adapted at runtime. (IS-AP)
RI-4: The reference architecture must allow the development of SaS that are able to identify and report (diagnose) problems occurring while an adaptation activity is being performed. (IS)
RI-5: The reference architecture must enable the development of SaS that are able to fix problems (healing) at runtime while an adaptation activity is being performed. (IS)
RI-6: The reference architecture must allow the development of SaS that has a mechanism able to preserve its execution state (current information) while a software entity is being adapted. (IS-RE)
...

C. Step RA-3: Architectural Design

As discussed in the previous section, a reference architecture can use more than one technique of representation and/or type of knowledge. Based on this evidence, the following resources were chosen for the design of RA4SaS: (i) types of knowledge/elements: action plan, autonomous subsystems, computational reflection, rule base, and supervisor systems; and (ii) techniques of representation: block diagrams, layers diagrams, and UML diagrams. These resources were considered sufficient to represent RA4SaS and the relationships between its components, since they allow forming a well-defined reflective environment for the development and adaptation of software at runtime. Figure 3 shows the general representation of RA4SaS using a block diagram (modules). This architecture is composed of a core for adaptation (dotted line) and four additional modules: the development module, the action plan module, the adaptation rules module, and the infrastructure module. In the following, a brief description of each module is presented, together with the details of the resources chosen in the design of RA4SaS.

Figure 3. General representation of RA4SaS

Development module. This module provides a set of guidelines for the development of SaS (software entities). These guidelines act on the following phases: requirements analysis, design, implementation, and evolution (adaptation). In short, software entities are developed containing only attributes, constructors, and getter and setter methods. The business logic of these entities is organized in another layer, as a source code template (the meaning of this template is explained in the source code module). For example, when an entity is developed, the persistence layer is generated automatically by the source code module. Later, when this entity is adapted, its persistence layer is regenerated. Thus, the developer only focuses efforts on the development of software entities, based on the established guidelines, and the other activities are done by automatic mechanisms. This strategy optimizes the work in time and cost, and reduces the involuntary injection of uncertainties. The evolution of the adaptive system can be considered a special issue, since it aims to capture the changes in a software entity during its life cycle. Initially, when a software system or software entity is developed, its goals are indirectly defined. Similarly, when it is adapted, its goals are also modified. These facts indicate that disordered adaptations can make an entity quickly increase in size, making its use in future systems unfeasible. Thus, in order to manage the number of adaptations of software entities and their goals, we opted to use a base of metrics that allows evaluating the complexity and behavior of entities. These metrics aim to control the cohesion and granularity of entities, so that they do not lose their flexibility/capacity for adaptation. To finalize the description of this module, it is noteworthy that the software entities operate in a supervisor (meta-level) - supervised (base-level) model. The adaptation occurs in four steps: (i) interception of the supervised entity, (ii) analysis of feasibility, (iii) establishment of an action plan (action plan module), and (iv) implementation of the changes in the supervised entity (base-level). Section IV will present more details of this adaptation process.
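To make these guidelines more concrete, the fragment below sketches what such a software entity could look like in Java. The Customer class and its attributes are hypothetical illustrations only, not code from RA4SaS; the business and persistence layers are assumed to be generated separately by the source code module.

// Hypothetical software entity following the development-module guidelines:
// only attributes, constructors, and getters/setters; no business logic here.
public class Customer {

    private String name;
    private String email;

    public Customer() {
    }

    public Customer(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }
    // Business logic and persistence live in separate, generated layers,
    // so the entity remains easy to disassemble and adapt at runtime.
}

Keeping entities restricted to attributes and accessors is what makes the later disassembly and reassembly steps tractable for the automatic mechanisms described next.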
Action plan module. This module aims at assisting in the adaptation activity of software entities. It must be able to control the dynamic behavior, individual reasons, and execution state in relation to the environment. Among the existing efforts in this direction, which answer the main 5W1H questions to adapt a software entity, the Generic Feedback Loop [41] and the MAPE-K model (Monitor, Analyze, Plan, Execute and Knowledge) [42] emerge as feasible solutions for an action plan for RA4SaS. Figure 4 shows the autonomous control loop.

Figure 4. Autonomous control loop (adapted from [9], [41], [42])

In short, the Monitor process has mechanisms that collect data from sensors and convert them into behavioral patterns and symptoms. This process is related to the Where, When and What questions in SaS. The Analyze process aims to correlate the obtained data and to model complex situations. Thus, the autonomous system can learn from the environment and predict future situations. It also helps to identify When and Where the modifications must be applied. The Plan process is responsible for deciding What will be adapted and How to apply the changes in the software entity to achieve the best outcome. The Execute process must provide the mechanisms that can execute the established action plan. This process relates to the questions of How, What, and When to change. Finally, the Sensors and Effectors are software entities that, respectively, generate a collection of data reflecting the state of the system and rely on in vivo mechanisms or autonomous subsystems to apply the changes [9], [41], [42].
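To give a concrete flavor of how such a control loop could be wired, the sketch below expresses the four MAPE processes as plain Java interfaces around hypothetical Symptom and AdaptationPlan types and drives them in a simple loop. It is only an illustrative approximation of the models described in [41], [42]; the names and the single-shot loop are assumptions, not the action plan module of RA4SaS.

import java.util.List;

// Hypothetical types standing in for monitored data and planned changes.
class Symptom {
    final String description;
    Symptom(String description) { this.description = description; }
}

class AdaptationPlan {
    final List<String> steps;
    AdaptationPlan(List<String> steps) { this.steps = steps; }
}

// The four MAPE processes as plain interfaces.
interface Monitor      { List<Symptom> collect(); }                     // Where, When, What
interface Analyzer     { boolean adaptationNeeded(List<Symptom> s); }   // When, Where
interface Planner      { AdaptationPlan plan(List<Symptom> s); }        // What, How
interface PlanExecutor { void execute(AdaptationPlan plan); }           // How, What, When

// A minimal feedback loop wiring the four processes together.
class ControlLoop {
    private final Monitor monitor;
    private final Analyzer analyzer;
    private final Planner planner;
    private final PlanExecutor executor;

    ControlLoop(Monitor m, Analyzer a, Planner p, PlanExecutor e) {
        monitor = m; analyzer = a; planner = p; executor = e;
    }

    void runOnce() {
        List<Symptom> symptoms = monitor.collect();
        if (analyzer.adaptationNeeded(symptoms)) {
            executor.execute(planner.plan(symptoms));
        }
    }
}

In a real setting such a loop would run continuously and share a knowledge base among the four processes, which is exactly the role the rule base and metrics play in RA4SaS.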
Adaptation rules module. This module is responsible for automatically extracting the adaptation rules of a software entity. For this, when an entity is developed and inserted into the execution environment, a metamodel (reflection module) is instantiated. Then, an automatic mechanism (named rulesFactory) is responsible for extracting this entity (metamodel) and creating a set of rules that describes its structure and behavior. Once generated, these rules are stored in the repositories (rule base) and reused when a search for adaptation is performed. Finally, these rules are written using the template of the DROOLS framework (http://www.jboss.org/drools).

Infrastructure module. This module aims to provide support for the adaptation of software entities at runtime. For this, it has a set of mechanisms that act in various contexts. Among them, three mechanisms are highlighted: (i) dynamic compilation and dynamic loading, which must be able to replace the software entity at runtime without restarting the application or redeploying its components; (ii) diagnosis of problems, which must be able to identify and report problems occurring while an adaptation activity is being performed; and (iii) self-healing, which must be able to fix problems at runtime while an adaptation activity is being performed. According to Salehie & Tahvildari [9], there are two approaches for the adaptation of software entities: (1) internal, when an entity, through mechanisms of the programming language, can perform self-adaptation; however, this approach is limited, as the entity cannot detect its context information; and (2) external, when a supervisor (manager) is responsible for interpreting the context information and applying it to the software entities. In general, only the second one is used in RA4SaS, since the software entities are monitored and adapted by external modules. In addition, this approach is specifically applied in items (ii) (diagnosis of problems) and (iii) (self-healing) of this module, presented previously.

Core of adaptation. This structure can be considered the "heart" of RA4SaS, since it gathers the set of modules responsible for managing software entity adaptation at runtime. Basically, this core is organized into six modules: (i) search module, (ii) annotation module, (iii) state module, (iv) source code module, (v) reflection module, and (vi) adaptation module. Among these modules, the reflection and adaptation modules stand out, since they act as the metarepresentation of software entities and as the adaptation "orchestrator", respectively. In the following, a brief description of each module is presented.

Search module. This module aims to assist in the search for software entities in the execution environment when an adaptation activity is invoked. Basically, this module has two search methods: (i) semantic metamodel (the same metamodel that will be presented in the reflection module), which can be defined as the description of the set of concepts and notations used to define a model. This description represents structure and behavior information of software entities, which is transformed into a model of semantic relationships (Entity-Entity, Entity-Attribute, Entity-Method, or Entity-Attribute-Method); or (ii) technical information, in which the search is specified only with technical information of the software system (Entity, Attribute, or Method) in the template format. These templates are converted into input parameters to search the rule repository.

Annotation module. This module aims to assist the software engineer in defining the adaptation level of software entities in the development stage. This information is used by the reflection module to identify what information can be modified (removed and/or added); thus, a metadata item (annotation) indicating the adaptation level supported by the entity must be present. Furthermore, this module has a functionality to verify whether the annotations were inserted correctly, since without this information the reflection module cannot identify what will be adapted.
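One possible way to realize the annotation module in Java is sketched below: a hypothetical @AdaptationLevel annotation records the level supported by an entity, and a small check verifies that this metadata is present before the reflection module proceeds. The annotation name, its levels, and the Product entity are illustrative assumptions rather than the actual RA4SaS annotations.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation indicating which kind of adaptation an entity supports.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AdaptationLevel {
    enum Level { STRUCTURAL, BEHAVIORAL, STRUCTURAL_AND_BEHAVIORAL }
    Level value();
}

// Example entity annotated at development time.
@AdaptationLevel(AdaptationLevel.Level.STRUCTURAL)
class Product {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class AnnotationChecker {
    // Returns the declared level, or fails when the metadata is missing,
    // since without it the reflection module cannot know what may be adapted.
    static AdaptationLevel.Level levelOf(Class<?> entity) {
        AdaptationLevel level = entity.getAnnotation(AdaptationLevel.class);
        if (level == null) {
            throw new IllegalStateException(entity.getName() + " lacks @AdaptationLevel");
        }
        return level.value();
    }
}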
State module. This module aims at preserving the current execution state of a software entity. When an entity is selected for adaptation, the information contained in its current state must be preserved. The entity is modified and the information is reinserted so that the execution is not interrupted. Basically, this module should have two functionalities, to convert an entity into a file (.xml) and vice versa. The choice of XML (eXtensible Markup Language) to perform these operations is related to the following facilities: file handling (reading and writing), integration with different programming languages, and ease of implementation.

Source code module. This module aims to generate the source code of the new (adapted) software entity based on the metamodel instantiated in the reflection module. To execute this operation, the software engineer must provide a software entity template based on the architectural pattern (logical layer, persistence layer, and others). Basically, this module should have three functionalities to generate source code that meets the adaptation interests: (i) structural, when one attribute or a list of attributes should be added to or removed from the entity; in this case, specifically, the getter and setter methods that manipulate these attributes are modified; (ii) behavioral, when only a list of methods should be added to or removed from the entity; and (iii) structural and behavioral, when one or a list of attributes and one or a list of methods should be added to or removed from the entity. Finally, it is noteworthy that this module has a mechanism for version control of the entities' source code, preventing them from being overwritten and keeping the versions of all stakeholders.

Reflection module. This module is organized into two submodules. The first one aims at assisting in the "disassembly" and "assembly" of a software entity. For this, it uses the annotation module to obtain the adaptation level supported by the entity. With this information, the disassembly process can be started, and structural and behavioral information (attributes and methods) is recovered via reflection and inserted into a metamodel (a logical information model that specifies the modeling elements used within another, or the same, modeling notation [43]), which resides in the second submodule. After instantiating this metamodel, new information according to the adaptation interests can be added or removed, generating a new metamodel. This new metamodel is transferred to the source code module to create the new software entity. Figure 5 shows this metamodel. As can be observed, this metamodel makes possible the adaptation of software systems that are object-oriented or that use the structure of classes as units, as can be noticed by the presence of classes (Clazz), attributes (FieldClass), methods (MethodClass), and so on. This strategy can be considered relevant for developers, since they can develop adaptive software entities in a notation close to their area of expertise.

Figure 5. Metamodel of software entity

Adaptation module. Broadly speaking, this module can be considered the RA4SaS "orchestrator", since it performs calls to and coordinates all activities of the other modules (core of adaptation). On closer examination, this module acts as the supervisor system of the software entities, monitoring their requests in the execution environment. For this, this module implements a well-defined process to adapt a software entity at runtime. In short, an entity is disassembled (metamodel), adapted (new information), and reassembled automatically, as in an "assembly line".
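The sketch below illustrates, under simplifying assumptions, how the "disassembly" performed by the reflection module could recover structural and behavioral information through the standard Java reflection API and store it in metamodel-like classes. Clazz, FieldClass, and MethodClass here are deliberately reduced stand-ins for the metamodel of Figure 5, not its actual definition.

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the metamodel elements of Figure 5.
class FieldClass {
    final String modifier, type, name;
    FieldClass(String modifier, String type, String name) {
        this.modifier = modifier; this.type = type; this.name = name;
    }
}

class MethodClass {
    final String modifier, returnType, name;
    MethodClass(String modifier, String returnType, String name) {
        this.modifier = modifier; this.returnType = returnType; this.name = name;
    }
}

class Clazz {
    final String name;
    final List<FieldClass> fields = new ArrayList<>();
    final List<MethodClass> methods = new ArrayList<>();
    Clazz(String name) { this.name = name; }
}

class Disassembler {
    // Recovers structural and behavioral information of an entity via reflection
    // and instantiates a (simplified) metamodel that can later be modified.
    static Clazz disassemble(Class<?> entity) {
        Clazz model = new Clazz(entity.getSimpleName());
        for (Field f : entity.getDeclaredFields()) {
            model.fields.add(new FieldClass(Modifier.toString(f.getModifiers()),
                                            f.getType().getSimpleName(), f.getName()));
        }
        for (Method m : entity.getDeclaredMethods()) {
            model.methods.add(new MethodClass(Modifier.toString(m.getModifiers()),
                                              m.getReturnType().getSimpleName(), m.getName()));
        }
        return model;
    }
}

A complementary "assembly" step would traverse such a model in the opposite direction, which is essentially what the source code module does when it regenerates the adapted entity.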
In addition, an important characteristic of the adaptation module is the ability to interpret the error messages produced when a software entity is being dynamically compiled/loaded, since these messages are useful information for corrections in the source code. Finally, it is important to emphasize that this process must be performed automatically by software engineering tools, intending to reduce the implementation complexity and to minimize the generation of uncertainties. Thus, RA4SaS intends to emerge as a feasible solution for software adaptation at runtime.

D. Step RA-4: Reference Architecture Evaluation

Aiming at improving the quality of RA4SaS, an inspection based on a checklist was conducted. This checklist makes it possible to look for defects related to omission, ambiguity, inconsistency, and incorrect information that may be present in the architecture. Besides that, aiming at observing the viability of RA4SaS, as well as its capability to support the development of SaS, case studies were conducted, and the obtained results are presented in the next section.

IV. CASE STUDY

Before starting the presentation of this case study, two considerations must be emphasized. The first one is that RA4SaS was instantiated using the Java programming language (http://www.oracle.com/us/technologies/java/overview/index.html), as it meets the requirements presented in Section III. For space reasons, the implementation details are not presented in this paper. The second one refers to the adaptation process existing in RA4SaS. This process represents a sequence of steps (orchestration of modules) so that a software entity is adapted. Figure 6 shows the adaptation process of RA4SaS.

Figure 6. RA4SaS Adaptation Process

The adaptation process of RA4SaS presented in Figure 6 is organized into nine steps, which are executed as an "assembly line". Initially (Step A), the adaptation level supported by the software entity is verified, since it will be used when building the metamodel (Step C). Step B is responsible for preserving the current execution state of the software entity at that moment. This information will be reused in Step I, if the software entity has not changed its domain (goal). In Step C, the software entity is "disassembled" and a metamodel is instantiated. This metamodel holds the structural and behavioral information of the software entity, besides its adaptation level obtained in Step A. Step D basically consists of establishing an action plan for the entity adaptation. However, this is a complex, macro activity (dotted line) that is formed by four steps (D, E, F, G). The action plan should establish adaptation criteria so that the entity goals do not become unviable (preventing future adaptations). In order to achieve this, the rule base must be used (Step E), since it provides guidelines regarding the feasibility of the adaptation and the goals. The procedure of adaptation (action plan) must be established based on the structural and behavioral requirements (Step E) and on the 5W1H model (Section III-C). Based on this process and on the adaptation information, a new metamodel is generated (Step G), which will be transferred to the source code module. Finally, the source code of the software entity is generated (Step H) and compiled so that it can be inserted into the execution environment (Step I). Regarding this last step, it is worth mentioning that, if the changes to the entity do not imply a change of domain (goals), the preserved information (Step B) is reinserted into the new entity to replace the old one in the execution environment. The replacement of the entity is performed transparently to its stakeholders, since they have no perception of the changes with respect to the preservation of the execution state and the newly generated instances.
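As an illustration of the dynamic compilation and diagnosis capabilities required by Steps H and I (and of the error interpretation performed by the adaptation module), the sketch below compiles a generated source file with the standard javax.tools API and collects the compiler messages. DynamicCompiler is a hypothetical helper written for this discussion, not part of the RA4SaS implementation.

import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;
import java.io.File;
import java.util.List;

class DynamicCompiler {
    // Compiles a generated source file and reports the compiler diagnostics,
    // which an adaptation module could use to correct the generated code.
    static boolean compile(File generatedSource) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a JRE-only runtime
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        try (StandardJavaFileManager fileManager =
                 compiler.getStandardFileManager(diagnostics, null, null)) {
            Iterable<? extends JavaFileObject> units =
                fileManager.getJavaFileObjectsFromFiles(List.of(generatedSource));
            boolean success = compiler.getTask(null, fileManager, diagnostics,
                                               null, null, units).call();
            for (Diagnostic<? extends JavaFileObject> d : diagnostics.getDiagnostics()) {
                System.err.println(d.getKind() + ": " + d.getMessage(null));
            }
            // After a successful compilation, the new class could be loaded, e.g.,
            // with a URLClassLoader pointing at the output directory.
            return success;
        }
    }
}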
After illustrating the adaptation process, we now present the case study. As previously mentioned, the adaptation of software entities can involve structural and/or behavioral changes. For reasons of space, a structural adaptation was chosen, since it is considered sufficient to show the applicability of RA4SaS in the adaptation of a software entity. The chosen example shows the adaptation at two levels: (i) association of new functionalities, which corresponds to the addition of new information (classes) through the following relationships: aggregation, composition, or association - in this paper, the composition relationship is used; and (ii) extension of new functionalities, which corresponds to the addition of new information (classes) through the inheritance relationship. Figure 7 shows the UML model for the software entities and the main fragments used in the two levels of adaptation.

Figure 7. Adaptation of the Customer software entity

To illustrate the association of new functionalities, the modification of the Customer software entity (UML model - Lane a) is considered. It was initially developed to act in a local system and will be adapted to act in a web system with authentication (both are information systems for bookstore management). In this context, to adapt this entity, a Login entity with two attributes (username and password) must be created and added through a composition relationship. To make these changes, a metamodel of this entity (Lane a) was instantiated by the reflection module so that the changes (Login entity) could be incorporated into it. Lines 2 to 5 (Lane b) show the creation of the username attribute with the following items: access modifier, type, and attribute name. The creation of the password attribute was omitted, but it was performed in a similar way. After that, these attributes were added to the Clazz class (lines 8 and 9). Note that the class constructor (line 7) defines the entity name in the metamodel. Finally, line 11 shows how the composition relationship between the entities is formed. The first parameter (line 11) represents the entity that is being created. The second and third parameters indicate, respectively, the name of the metamodel entity that will be associated with the new entity (first parameter) and the cardinality between these entities. The fourth parameter indicates the type of relationship (composition) and the navigability of this relationship (Person to Login). The UML model (Lane b) shows the Customer software entity after the adaptation process.
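Since the code fragments of Figure 7 are not reproduced here, the sketch below approximates, with hypothetical API names, what the Lane b fragment does: create the username and password attributes, add them to a Login entity whose name is defined in the constructor, and declare a composition relationship navigable towards Login. The real RA4SaS metamodel API may differ in names and signatures.

import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified metamodel API used only to illustrate the Lane b fragment.
class FieldClass {
    final String modifier, type, name;              // access modifier, type, attribute name
    FieldClass(String modifier, String type, String name) {
        this.modifier = modifier; this.type = type; this.name = name;
    }
}

class Clazz {
    final String entityName;                        // the constructor defines the entity name
    final List<FieldClass> fields = new ArrayList<>();
    Clazz(String entityName) { this.entityName = entityName; }
    void addField(FieldClass field) { fields.add(field); }
}

enum RelationshipType { ASSOCIATION, AGGREGATION, COMPOSITION, INHERITANCE }

class Relationship {
    final Clazz newEntity;        // first parameter: the entity being created
    final String targetEntity;    // second parameter: existing entity in the metamodel
    final String cardinality;     // third parameter: cardinality between the entities
    final RelationshipType type;  // fourth parameter: relationship type (and navigability)
    Relationship(Clazz newEntity, String targetEntity,
                 String cardinality, RelationshipType type) {
        this.newEntity = newEntity; this.targetEntity = targetEntity;
        this.cardinality = cardinality; this.type = type;
    }
}

class LaneBExample {
    static Relationship buildLoginForCustomer() {
        FieldClass username = new FieldClass("private", "String", "username");
        FieldClass password = new FieldClass("private", "String", "password");

        Clazz login = new Clazz("Login");
        login.addField(username);
        login.addField(password);

        // Composition relationship between the new Login entity and Customer.
        return new Relationship(login, "Customer", "1..1", RelationshipType.COMPOSITION);
    }
}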
To illustrate the extension of new functionalities, the modification of the new Customer software entity (UML model - Lane b) is considered. This entity acts in a web system with authentication for bookstore management and will be adapted to act in a school management system. Similarly to the previous adaptation level, the metamodel was instantiated so that a new entity could be added. Initially, between lines 2 and 5 (Lane c), it is possible to observe the creation of the academicRecord attribute, which will be inserted into the Student entity (line 7). Finally, line 9 shows how the inheritance relationship between the entities is formed. The first parameter represents the entity that is being created. The second parameter indicates the name of the metamodel entity that will be associated with the new entity (first parameter). The third and fourth parameters indicate, respectively, that there is no cardinality and the type of relationship (inheritance) between these entities. Finally, the UML model (Lane c) shows the Student software entity after the adaptation process.

V. CONCLUSION AND FUTURE WORK

This paper presented RA4SaS, which intends to support the development of SaS. By using reflection, software entities developed with this architecture are transparently monitored and adapted at runtime, without the stakeholders' perception. To perform these operations, this architecture proposes the use of modules working as an "assembly line", i.e., a software entity is disassembled, adapted, and reassembled automatically by these modules. The case study presented in this paper shows good perspectives for applying structural adaptation of software entities through automatic mechanisms. In addition, it is noteworthy that behavioral adaptation is also supported by SaS developed using RA4SaS. Based on this scenario, the main contributions of this paper are: (i) for the SaS area, the provision of a means that facilitates the development of SaS; (ii) for the Reference Architecture area, the proposal of a first reference architecture based on reflection; and (iii) for the Software Automation area, since RA4SaS makes it possible to develop adaptable software entities through an assembly line. As future work, three goals are intended: (i) the conduction of more case studies intending to evaluate RA4SaS more completely; (ii) the instantiation of RA4SaS for other programming languages; and (iii) the use of RA4SaS in industry, since it is intended to evaluate the behavior of RA4SaS when applied in larger development and execution environments. Therefore, a positive research scenario is expected, intending to make RA4SaS an effective contribution to the software development community.

ACKNOWLEDGMENT

This work is supported by PROPe/UNESP and Brazilian funding agencies (FAPESP, CNPq and CAPES).

REFERENCES

[1] P. Maes, "Concepts and experiments in computational reflection," in OOPSLA 1987, ser. OOPSLA '87. New York, NY, USA: ACM, 1987, pp. 147-155.
[2] G. Coulson, G. Blair, and P. Grace, "On the performance of reflective systems software," in PCCC 2004, 2004, pp. 763-769.
[3] Y. Shi, L. ZaoQing, W. JunLi, and W. FuDi, "A reflection mechanism for reusing software architecture," in QSIC 2006, Oct. 2006, pp. 235-243.
[4] Y. Peng, Y. Shi, J. Xiang-Yang, Y. Jun-Feng, L. Ju-Bo, and Y. Wen-Jie, "A reflective information model for reusing software architecture," in CCCM/ISECS 2008, vol. 1, 2008, pp. 270-275.
[5] B. Morin, O. Barais, J.-M. Jezequel, F. Fleurey, and A. Solberg, "Models@run.time to support dynamic adaptation," Computer, vol. 42, no. 10, pp. 44-51, Oct. 2009.
[6] X. Hongzhen and Z. Guosun, "Retracted: Specification and verification of dynamic evolution of software architectures," Journal of Systems Architecture, vol. 56, no. 10, pp. 523-533, 2010.
[7] S. Vinoski, "A time for reflection [software reflection]," Internet Computing, IEEE, vol. 9, no. 1, pp. 86-89, Jan.-Feb. 2005.
[8] J. Andersson, R. de Lemos, S. Malek, and D. Weyns, "Reflecting on self-adaptive software systems," in SEAMS/ICSE 2009, May 2009, pp. 38-47.
[9] M. Salehie and L. Tahvildari, "Self-adaptive software: Landscape and research challenges," ACM Trans. Auton. Adapt. Syst., vol. 4, no. 2, pp. 1-42, May 2009.
[10] F. J. Affonso and E. L. L. Rodrigues, "A proposal of reference architecture for the reconfigurable software development," in SEKE 2012, 2012, pp. 668-671.
[11] J. Whitehead, "Collaboration in software engineering: A roadmap," in FOSE 2007. Washington, DC, USA: IEEE Computer Society, 2007, pp. 214-225.
[12] E. Y. Nakagawa, F. Oquendo, and M. Becker, "RAModel: A reference model of reference architectures," in ECSA/WICSA 2012, Helsinki, Finland, 2012, pp. 297-301.
[13] R. High, S. Kinder, and S. Graham, "IBM's SOA foundation - an architectural introduction and overview," IBM, Tech. Rep., 2005. [Online]. Available: http://www.ginkgo-networks.com/IMG/pdf/AC_Blueprint_White_Paper_V7.pdf
[14] S. Angelov and P. Grefen, "An e-contracting reference architecture," J. Syst. Softw., vol. 81, no. 11, pp. 1816-1844, Nov. 2008.
[15] A. Grosskurth and M. Godfrey, "A reference architecture for web browsers," in ICSM 2005, Sept. 2005, pp. 661-664.
[16] E. Y. Nakagawa, F. C. Ferrari, M. M. Sasaki, and J. C. Maldonado, "An aspect-oriented reference architecture for software engineering environments," Journal of Systems and Software, vol. 84, no. 10, pp. 1670-1684, 2011.
[17] S. Angelov, P. Grefen, and D. Greefhorst, "A classification of software reference architectures: Analyzing their success and effectiveness," in WICSA/ECSA 2009, 2009, pp. 141-150.
[18] J. Bayer, T. Forster, D. Ganesan, J.-F. Girard, I. John, J. Knodel, R. Kolb, and D. Muthig, "Definition of reference architectures based on existing systems," Fraunhofer IESE, Tech. Rep. 034.04/E, 2004.
[19] E. Y. Nakagawa, R. M. Martins, K. R. Felizardo, and J. C. Maldonado, "Towards a process to design aspect-oriented reference architectures," in CLEI 2009, 2009, pp. 1-10.
[20] M. Galster and P. Avgeriou, "Empirically-grounded reference architectures: a proposal," in QoSA/ISARCS 2011, ser. QoSA-ISARCS '11. New York, NY, USA: ACM, 2011, pp. 153-158.
[21] S. Angelov, P. Grefen, and D. Greefhorst, "A framework for analysis and design of software reference architectures," Information and Software Technology, vol. 54, no. 4, pp. 417-431, 2012.
[22] D. Weyns, S. Malek, and J. Andersson, "FORMS: a formal reference model for self-adaptation," in ICAC 2010, ser. ICAC '10. New York, NY, USA: ACM, 2010, pp. 205-214.
[23] J. Kramer and J. Magee, "Self-managed systems: an architectural challenge," in FOSE 2007, May 2007, pp. 259-268.
[24] L. Liu, S. Thanheiser, and H. Schmeck, "A reference architecture for self-organizing service-oriented computing," in ARCS 2008, ser. Lecture Notes in Computer Science, U. Brinkschulte, T. Ungerer, C. Hochberger, and R. Spallek, Eds. Springer Berlin / Heidelberg, 2008, vol. 4934, pp. 205-219.
[25] D. Weyns, S. Malek, and J. Andersson, "On decentralized self-adaptation: lessons from the trenches and challenges for the future," in SEAMS/ICSE 2010, ser. SEAMS '10. New York, NY, USA: ACM, 2010, pp. 84-93. [Online]. Available: http://doi.acm.org/10.1145/1808984.1808994
[26] M. Erradi, G. Bochmann, and I. Hamid, "Dynamic modifications of object-oriented specifications," in CompEuro '92, 1992, pp. 654-659.
[27] E. Borde, G. Haïk, and L. Pautet, "Mode-based reconfiguration of critical software component architectures," in DATE 2009, ser. DATE '09. 3001 Leuven, Belgium: European Design and Automation Association, 2009, pp. 1160-1165.
[28] X. Chen, "Extending RMI to support dynamic reconfiguration of distributed systems," in ICDCS 2002, 2002, pp. 401-408.
[29] E. Tanter, R. Toledo, G. Pothier, and J. Noyé, "Flexible metaprogramming and AOP in Java," Sci. Comput. Program., vol. 72, no. 1-2, pp. 22-30, Jun. 2008.
[30] A. Janik and K. Zielinski, "AAOP-based dynamically reconfigurable monitoring system," Inf. Softw. Technol., vol. 52, no. 4, pp. 380-396, Apr. 2010.
[31] J. Kephart and D. Chess, "The vision of autonomic computing," Computer, vol. 36, no. 1, pp. 41-50, 2003.
[32] B. Kitchenham, "Procedures for performing systematic reviews," Keele University and National ICT Australia Ltd, Tech. Rep. TR/SE-0401 and NICTA Technical Report 0400011T.1, Jul. 2004.
[33] B. Kitchenham and S. Charters, "Guidelines for performing systematic literature reviews in software engineering," Keele University and Durham University Joint Report, Tech. Rep. EBSE 2007-001, 2007. Available: http://www.dur.ac.uk/ebse/resources/Systematic-reviews-5-8.pdf
[34] O. Dieste, M. Lopez, and F. Ramos, "Formalizing a systematic review updating process," in SERA 2008, 2008, pp. 143-150.
[35] E. Y. Nakagawa and L. B. R. de Oliveira, "Using systematic review to elicit requirements of reference architectures," in WER 2011, 2011.
[36] J. Han and A. Colman, "The four major challenges of engineering adaptive software architectures," in COMPSAC 2007, vol. 2, 2007, pp. 565-572.
[37] J. S. Bradbury, J. R. Cordy, J. Dingel, and M. Wermelinger, "A survey of self-management in dynamic software architecture specifications," in WOSS 2004, ser. WOSS '04. New York, NY, USA: ACM, 2004, pp. 28-33.
[38] N. M. Villegas, H. A. Müller, G. Tamura, L. Duchien, and R. Casallas, "A framework for evaluating quality-driven self-adaptive software systems," in SEAMS/ICSE 2011, ser. SEAMS '11. New York, NY, USA: ACM, 2011, pp. 80-89.
[39] ISO/IEC, ISO/IEC 9126, Software engineering - Product quality. ISO/IEC, 2001.
[40] Microsoft Patterns & Practices Team, Microsoft Application Architecture Guide (Patterns & Practices), 2nd ed., ser. Microsoft Press Series. Microsoft Press, November 2009, vol. 1.
[41] S. Dobson, S. Denazis, A. Fernández, D. Gaïti, E. Gelenbe, F. Massacci, P. Nixon, F. Saffre, N. Schmidt, and F. Zambonelli, "A survey of autonomic communications," ACM Trans. Auton. Adapt. Syst., vol. 1, no. 2, pp. 223-259, Dec. 2006.
[42] IBM, "An architectural blueprint for autonomic computing," White Paper, Third Edition, 2005. [Online]. Available: http://www.ginkgo-networks.com/IMG/pdf/AC_Blueprint_White_Paper_V7.pdf
[43] IEEE, "Systems and software engineering - vocabulary," ISO/IEC/IEEE 24765:2010(E), pp. 1-418, 2010.