Open Data Science Conference – San Francisco/2015

Hey folks, over the weekend of November 14 and 15 I was in San Francisco for the Open Data Science Conference. The event had already taken place in Boston earlier in 2015, and the plan for 2016 is to hold it in a few more cities here in the United States and to take it to England and Japan as well. The venue was great: right next to the San Francisco airport, which made logistics much easier for those coming from out of town.

The event was MUCH better than I expected, and the organization was flawless. The rooms had space for every attendee, the workshops and talks were relevant, and interaction with the sponsors came naturally because they were sponsors that actually mattered to the event and to us, the attendees. At the end of the first day there was a networking session (with free beer!) right there in the hotel, and it was interesting to chat with some data scientists from here in the United States.

I took some notes on the talks and workshops I found most interesting during the event. Here they are:

Day 1 – Saturday

Keynote

Brian Granger – co-founder of Project Jupyter and core IPython developer

Jupyter is a browser-based notebook environment for Python (and more than 40 other languages) that lets you build narrative solutions together with other professionals. It is similar to knitr (for R), but instead of only rendering output in the chosen format, it lets the script you write be executed by the server through the browser. Jupyter is an open platform, developed by the community with an all-star team behind it, and its growth has been helped by using GitHub as its repository. Companies such as Microsoft, Google, IBM and several others are building solutions for the Jupyter Notebook or for kernels. JupyterHub lets companies run Jupyter inside their organizations (as far as I understood, something like a private GitHub repository). In the near future the name may change to Jupyter Workbench, and the team is also working on delivering real-time collaboration and on putting a console, interactive charts and several other UX features in a single browser window, so developers can stay in one window and be more productive.

Brian gave the example of two journalists who write about science at BuzzFeed and publish their experiments on GitHub, which lets the rest of us reproduce their research. If you want to take a look, here is the link: https://github.com/BuzzFeedNews/everything

 

Talks

Claudia Perlich – Big Data killed the click | unsung metric heroes

40% of clicks are accidental or fraudulent! Fraud is easy to generate with bots on conversion events, especially in retargeting, to inflate the CTR so that the advertiser pays more for the service. Bot behavior is easier to predict than human behavior; ours is almost impossible to predict. In a focused analysis, when you are mining a dataset for behavior patterns, you need to know what you are looking for in order to rule out problems caused by randomness. An accuracy of around 50% means the model is doing no better than random guessing (on a balanced two-class problem). The goal is to reach a threshold above 70 or 80%. One technique that can help is to randomly collect some records from the most important 1 or 2% of your population and analyze the behavior of that sample. The chance that this sample shows a behavior pattern different from the overall behavior is very high, and several studies show that this behavior may be closer to reality than you would think.
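To make the two ideas from this talk a bit more concrete, here is a minimal Python sketch. It is not Claudia's code: the function names, the `score` field and the data are all made up for illustration. The first function draws a random sample from the top-scoring slice of a dataset, and the second one shows why an accuracy close to 50% on a balanced yes/no problem is what you get from pure guessing.

```python
import random


def sample_from_top(records, score_key, top_fraction=0.02, sample_size=5, seed=0):
    """Randomly pick records from the highest-scoring slice of a dataset."""
    if not records:
        return []
    ranked = sorted(records, key=lambda r: r[score_key], reverse=True)
    # Keep at least one record so tiny datasets still yield a slice.
    cutoff = max(1, int(len(ranked) * top_fraction))
    top_slice = ranked[:cutoff]
    rng = random.Random(seed)
    return rng.sample(top_slice, min(sample_size, len(top_slice)))


def coin_flip_accuracy(n=10_000, seed=0):
    """Accuracy of random guesses against random balanced labels."""
    rng = random.Random(seed)
    labels = [rng.random() < 0.5 for _ in range(n)]
    guesses = [rng.random() < 0.5 for _ in range(n)]
    hits = sum(label == guess for label, guess in zip(labels, guesses))
    return hits / n


# Hypothetical data: 1,000 users with a made-up "score" field.
users = [{"id": i, "score": i / 1000} for i in range(1000)]
picked = sample_from_top(users, "score", top_fraction=0.02, sample_size=5)
print(picked)
print(coin_flip_accuracy())  # lands close to 0.5
```

How you define "most important" (the score) is the real work here; the sketch simply assumes that a numeric score already exists for each record.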

 

Richard Socher, PhD – Metamind.io – Deep Learning for Enterprise

This was the most awesome talk and the one that impressed me the most. Richard presented a range of real-world problems and the solutions his company built using Deep Learning. This talk made me change my schedule and attend a Deep Learning workshop on the second day of the event instead of the talks I had planned to see. I don't know how to describe the techniques he presented, and there was almost no theory, only that speech and image recognition are hard to do with conventional Data Science algorithms and that Deep Learning techniques help a lot with these tasks. This kind of processing is very expensive; the CPU is used for text processing and the GPU for image processing.

 

Workshop

John Mount / Nina Zumel – Preparing data for analysis using R: basic through advanced techniques

A 2-hour workshop covering a range of techniques for working with data, cleaning it and running analyses. A GitHub repository was made available so we could set up our machines beforehand and follow the workshop with everything that was discussed (forked repo): https://github.com/diegonogare/PreparingDataWorkshop

 

Day 2 – Sunday

Workshop

Markus Beissinger – Intro to Deep Learning with Theano and OpenDeep

Deep Learning breaks the analysis into a hierarchy, finding the parts item by item, each trained separately, in order to later find the object you want. For example, take face recognition: the technique starts with a layer that recognizes separate elements such as nose, ear, eye and mouth; the next layer recognizes a bit more, such as relative positions: eyes close to the nose, mouth close to the nose, ears beside the eyes, and so on. Finally, in this example, a third classification layer is able to recognize a complete face.

Internally the process relies on linear algebra, because everything in Deep Learning is computed with matrices and vectors. One of the most basic computations is still Logistic Regression, used to estimate probabilities.
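As a rough illustration of "matrices and vectors", this is what a Logistic Regression prediction looks like when written out by hand. This is only a sketch with made-up weights, not the workshop code; it uses plain Python lists so it runs without extra libraries, whereas in practice you would use something like NumPy or Theano.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_proba(X, w, b):
    """Logistic regression: a matrix-vector product followed by a sigmoid."""
    return [sigmoid(dot(row, w) + b) for row in X]


# Made-up inputs (one row per example) and made-up weights.
X = [[0.0, 0.0], [1.0, 2.0], [-1.0, -2.0]]
w = [0.5, 0.25]
b = 0.0

probs = predict_proba(X, w, b)
print(probs)  # first value is exactly 0.5 because its weighted sum is 0
```

Training (finding good values for `w` and `b`) is left out on purpose; the sketch only shows the prediction step.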

Although Logistic Regression is the most basic building block, one of the most widely used structures is the Neural Network, which makes it possible to run many Logistic Regressions in parallel and find a better solution for the analysis you are doing with Deep Learning. Many other techniques, such as Convolutional Nets, are combined with artificial neural networks to perform image recognition.
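The "many Logistic Regressions in parallel" idea can be sketched in a few lines of Python. Again, this is my own illustration and not code from the workshop: the weights are invented, there is no training, and a real network would be built with a library such as Theano or OpenDeep. Each row of a weight matrix acts as one logistic unit, and stacking layers gives the hierarchy described above.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(x, W, b):
    """Each row of W (with its bias) is one logistic-regression unit."""
    return [
        sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


def forward(x, layers):
    """Feed the input through every layer in order."""
    for W, b in layers:
        x = dense_layer(x, W, b)
    return x


# Made-up weights: 2 inputs -> 3 hidden units -> 1 output unit.
hidden = ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0])
output = ([[1.0, 1.0, 1.0]], [-1.5])

y = forward([1.0, 1.0], [hidden, output])
print(y)  # a single probability-like value between 0 and 1
```

The hidden layer here is literally three logistic regressions evaluated on the same input, and the output layer is one more logistic regression on top of their results.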

The code used in the examples is available in these two forked repositories: https://github.com/diegonogare/odsc and https://github.com/diegonogare/OpenDeep

 

Fidan Boylu / Muxi Li – How to build and operationalize data science solutions with Cortana Analytics

Fidan gave an interesting overview of AzureML using some of the built-in algorithm models available in AzureML, then expanded the possibilities using R. It was a guided tour of analyzing data we had uploaded, and I'm sure everyone managed to follow along. Next, Muxi showed the integration between AzureML and Python scripts, analyzing everything in Jupyter Notebooks, which integrate with AzureML. I couldn't keep up with this second part and, from what I saw, the people who were following along got lost too.

PS. They didn't say a word about Cortana Analytics! 🙁

You can follow along later, step by step, with the data made available in this forked repository: https://github.com/diegonogare/Azure-Machine-Learning-Lab

 

Ted Kwartler – Introduction to text mining using R

There are many problems in interpreting text: the data is unstructured, expression is individual, there are cultural implications, and several other factors. You can approach Text Mining in two ways: one uses syntactic parsing and the other uses a “bag of words”. This session focused on bag of words. Sentiment analysis always comes up when people talk about text mining; Ted is still studying this area and showed a few things using dictionary-based scoring techniques. In short, several techniques were presented, and you can follow them in the GitHub repository I put together, not forked from Ted's GitHub but from a folder on his Amazon Drive: https://github.com/diegonogare/DataScience/tree/master/Text%20Mining
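The workshop was in R, but the two ideas (bag of words and a dictionary-based sentiment score) are easy to sketch in Python as well. This is just my own minimal illustration, not Ted's code: the tiny word lists are hypothetical and a real analysis would use a proper sentiment lexicon.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, only for illustration.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "boring", "hate"}


def bag_of_words(text):
    """Count how often each word appears, ignoring order and grammar."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def sentiment_score(text):
    """Positive word count minus negative word count."""
    bag = bag_of_words(text)
    positive_hits = sum(n for word, n in bag.items() if word in POSITIVE)
    negative_hits = sum(n for word, n in bag.items() if word in NEGATIVE)
    return positive_hits - negative_hits


print(bag_of_words("Great talk, great speaker"))
# 2 positive hits ("great" twice) minus 1 negative hit ("boring") gives 1
print(sentiment_score("Great talk, great speaker, boring slides"))
```

Because a bag of words throws away word order, a sentence like "not good" would still count as positive here, which is one of the reasons sentiment analysis is harder than it looks.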


