2023-12-31

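2023 in Review

A summary of what I did in 2023, as a year-end retrospective.

Goal management

As in previous years, I ran a cycle of quarterly reviews and monthly progress checks: in January, April, July, and October I reviewed my goals and set goals for the next quarter, and in the remaining months (February, March, May, June, August, September, November, December) I checked progress. This year, however, my motivation for goal management was low and I did not review much. Rather than achieving new goals, the year was one of steadily and quietly building on what I already had. I did not miss a single day of the morning reading I started last autumn, and it is now fully a habit.

Papers

January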

Culnane, Chris, Benjamin IP Rubinstein, and Vanessa Teague. "Health data in an open world." arXiv preprint arXiv:1712.05627 (2017). https://arxiv.org/pdf/1712.05627.pdf
Narayanan, Arvind. "An adversarial analysis of the reidentifiability of the heritage health prize dataset." Unpublished manuscript (2011). https://www.cs.princeton.edu/~arvindn/publications/heritage-health-re-identifiability.pdf

April

Li, Ninghui, Wahbeh Qardaji, and Dong Su. "On sampling, anonymization, and differential privacy or, k-anonymization meets differential privacy." Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security. 2012. https://arxiv.org/pdf/1101.2604.pdf
Li, Ninghui, Wahbeh H. Qardaji, and Dong Su. "Provably private data anonymization: Or, k-anonymity meets differential privacy." CoRR, abs/1101.2604 49 (2011): 55. https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2010-24-report.pdf

Skimmed or read in part

January

Asghar, Hassan Jameel, Paul Tyler, and Mohamed Ali Kaafar. "Differentially private release of public transport data: The opal use case." arXiv preprint arXiv:1705.05957 (2017). https://arxiv.org/pdf/1705.05957.pdf
Sweeney, Latanya. "Weaving technology and policy together to maintain confidentiality." The Journal of Law, Medicine & Ethics 25.2-3 (1997): 98-110. https://latanyasweeney.org/JLME.pdf

April

Ghazi, Badih, et al. "Algorithms with More Granular Differential Privacy Guarantees." arXiv preprint arXiv:2209.04053 (2022). https://arxiv.org/pdf/2209.04053.pdf
Desfontaines, Damien, et al. "Differential privacy with partial knowledge." arXiv preprint arXiv:1905.00650 (2019). https://arxiv.org/pdf/1905.00650.pdf
Li, Ninghui, Wahbeh Qardaji, and Dong Su. "On sampling, anonymization, and differential privacy or, k-anonymization meets differential privacy." Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security. 2012. https://arxiv.org/pdf/1101.2604.pdf

May

Ohm, Paul. "Broken promises of privacy: Responding to the surprising failure of anonymization." UCLA L. Rev. 57 (2009): 1701. http://www.lawlib.zju.edu.cn/attachments/file/20201118/20201118174834_66017.pdf
Ohm, Paul. "Sensitive information." S. Cal. L. Rev. 88 (2014): 1125. https://southerncalifornialawreview.com/wp-content/uploads/2018/01/88_1125.pdf

June

Syomantak Chaudhuri and Thomas A. Courtade. "Mean Estimation Under Heterogeneous Privacy: Some Privacy Can Be Free" arXiv preprint arXiv:2305.09668 (2023). https://arxiv.org/pdf/2305.09668.pdf

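I read far too few papers. Unlike morning reading, paper reading has not become a habit, and that is the problem. Weekends also go to hobbies, so my action for next year is to strike a balance and build a habit of reading steadily, even if only for a few minutes at a time.

Books

Books including manga: 42 (+13 year over year, up from 29)

The books I was especially glad to have read were データ指向アプリケーションデザイン, 考える技術・書く技術, イシューからはじめよ, and ナチスは「良いこと」もしたのか？. 『ナチスは「良いこと」もしたのか？』 in particular is a must-read, even if you read only the preface, which lays out a three-layer structure of fact, interpretation, and opinion in historical research.

Technical books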

データ匿名化手法 ―ヘルスデータ事例に学ぶ個人情報保護（Khaled El Emam,Luk Arbuckle/オライリージャパン）
データ指向アプリケーションデザイン ―信頼性、拡張性、保守性の高い分散システム設計の原理（Martin Kleppmann/オライリージャパン）
型システム入門 −プログラミング言語と型の理論−（Benjamin C. Pierce/オーム社）

Business and leisure

解像度を上げる――曖昧な思考を明晰にする「深さ・広さ・構造・時間」の４視点と行動法（馬田隆明/英治出版）
新版　考える技術・書く技術　問題解決力を伸ばすピラミッド原則（バーバラ・ミント/ダイヤモンド社）
イシューからはじめよ――知的生産の「シンプルな本質」（安宅和人/英治出版）
スタッフエンジニア　マネジメントを超えるリーダーシップ（Will Larson/日経BP）
10年戦えるデータ分析入門 SQLを武器にデータ活用時代を生き抜く (Informatics &IDEA)（青木峰郎/SBクリエイティブ）
ビジネスダッシュボード設計・実装ガイドブック成果を生み出すデータと分析のデザイン（トレジャーデータ,池田俊介,藤井温子,櫻井将允,花岡明/翔泳社）
おそろしいビッグデータ超類型化AI社会のリスク（山本龍彦/朝日新書/朝日新聞出版）
検証ナチスは「良いこと」もしたのか？（小野寺拓也,田野大輔/岩波ブックレット 1080/岩波書店）
ヒトラーの脱走兵-裏切りか抵抗か、ドイツ最後のタブー（對馬達雄/中公新書/中央公論新社）
批評理論入門―『フランケンシュタイン』解剖講義 (廣野由美子/中公新書/中央公論新社）
夏への扉（ロバート・A. ハインライン/ハヤカワ文庫SF/早川書房）
孤島の鬼（江戸川乱歩/創元推理文庫―現代日本推理小説叢書/東京創元社）
流行作家の死（野村胡堂/ゴマブックス）
倒れるときは前のめり（有川ひろ/角川文庫//KADOKAWA）
倒れるときは前のめりふたたび（有川ひろ/角川文庫//KADOKAWA）
アンマーとぼくら（有川ひろ/講談社文庫/講談社）
イマジン?（有川ひろ/幻冬舎文庫あ 34-8/幻冬舎）

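Articles

Continuing from last year, I kept reading English articles. I can feel my reading skill improving, but I am still slow, and when I am tired English text does not sink in. Below are the articles I was glad to have read this year.

2022 in Review

A summary of what I did in 2022, as a year-end retrospective.

Goal management

As in previous years, I ran a cycle of quarterly reviews and monthly progress checks: in January, April, July, and October I reviewed my goals and set goals for the next quarter, and in February, March, May, June, August, September, November, and December I checked progress.

I set the following four top-level categories and defined mid-level goals for each. I then broke the mid-level goals down further and managed them as small goals each quarter. The numbers in parentheses are the mid-level goals achieved out of the total.

Reading (2/3)
Health (1/2)
English (1/1)
Hobbies (2/5)

Papers

January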

Amin, K., Dick, T., Kulesza, A., Medina, A. M., & Vassilvitskii, S. (2019). Differentially private covariance estimation. Advances in Neural Information Processing Systems, 32(NeurIPS).
Stadler, T., Oprisanu, B., & Troncoso, C. (2020). Synthetic Data -- Anonymisation Groundhog Day.
Rogers, R., Subramaniam, S., Peng, S., Durfee, D., Lee, S., Kancha, S. K., Sahay, S., & Ahammad, P. (2020). LinkedIn’s Audience Engagements API: A Privacy Preserving Data Analytics System at Scale. http://arxiv.org/abs/2002.05839

February

Cao, Yang and Yoshikawa, Masatoshi and Xiao, Yonghui and Xiong, Li, Quantifying differential privacy under temporal correlations, Data Engineering (ICDE), 2017 IEEE 33rd International Conference on, 821--832, 2017.

March

Kareem Amin, Jennifer Gillenwater, Matthew Joseph, Alex Kulesza, Sergei Vassilvitskii. "Plume: Differential Privacy at Scale" arXiv preprint arXiv:2201.11603 (2022).

April

Cangialosi, Frank, et al. "Privid: Practical, privacy-preserving video analytics queries." arXiv preprint arXiv:2106.12083 (2021). https://www.usenix.org/system/files/nsdi22-paper-cangialosi.pdf

May

Chaudhuri, Kamalika, Jacob Imola, and Ashwin Machanavajjhala. "Capacity bounded differential privacy." Advances in Neural Information Processing Systems 32 (2019). https://proceedings.neurips.cc/paper/2019/file/04df4d434d481c5bb723be1b6df1ee65-Paper.pdf

June

Adam, Nabil R., and John C. Worthmann. "Security-control methods for statistical databases: a comparative study." ACM Computing Surveys (CSUR) 21.4 (1989): 515-556. https://dl.acm.org/doi/pdf/10.1145/76894.76895
Rowe, Neil C. "Diophantine inferences from statistical aggregates on few-valued attributes." 1984 IEEE First International Conference on Data Engineering. IEEE, 1984.

July

Denning, Dorothy Elizabeth Robling. Cryptography and data security. Vol. 112. Reading: Addison-Wesley, 1982. https://core.ac.uk/download/pdf/36729637.pdf （6.1-6.4）
Matsumoto, Marin, et al. "Measuring Lower Bounds of Local Differential Privacy via Adversary Instantiations in Federated Learning." arXiv preprint arXiv:2206.09122 (2022). https://arxiv.org/pdf/2206.09122.pdf
Dwork, Cynthia, Nitin Kohli, and Deirdre Mulligan. "Differential privacy in practice: Expose your epsilons!." Journal of Privacy and Confidentiality 9.2 (2019). https://par.nsf.gov/servlets/purl/10217360

September

Kifer, Daniel, and Ashwin Machanavajjhala. "No free lunch in data privacy." Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. 2011. https://www.cse.psu.edu/~duk17/papers/nflprivacy.pdf

December

El Emam, Khaled, et al. "A systematic review of re-identification attacks on health data." PloS one 6.12 (2011): e28071. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0028071

Skimmed or read in part

January

Desfontaines, D., Voss, J., Gipson, B., & Mandayam, C. (2022). Differentially private partition selection. Proceedings on Privacy Enhancing Technologies, 2022(1), 339–352. https://doi.org/10.2478/popets-2022-0017

March

Asi, Hilal, John Duchi, and Omid Javidbakht. "Element level differential privacy: The right granularity of privacy." arXiv preprint arXiv:1912.04042 (2019).

April

Bo Wang, Hongtao Li, Yina Guo, Xiaoyu Ren, "An Efficient Location Privacy Protection Method for Location-Based Services based on Differential Privacy", Research Square, 2022. https://assets.researchsquare.com/files/rs-1504387/v1_covered.pdf?c=1649083710
Gorka Abad, Stjepan Picek, Víctor Julio Ramírez-Durán, Aitor Urbieta, "On the Security & Privacy in Federated Learning". arXiv preprint arXiv:2112.05423 (2022). https://arxiv.org/pdf/2112.05423.pdf
Yeom, Samuel, et al. "Privacy risk in machine learning: Analyzing the connection to overfitting." 2018 IEEE 31st computer security foundations symposium (CSF). IEEE, 2018. https://arxiv.org/pdf/1709.01604.pdf
Song, Liwei, and Prateek Mittal. "Systematic evaluation of privacy risks of machine learning models." 30th USENIX Security Symposium (USENIX Security 21). 2021. https://www.usenix.org/system/files/sec21-song.pdf
Nasr, Milad, Reza Shokri, and Amir Houmansadr. "Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning." 2019 IEEE symposium on security and privacy (SP). IEEE, 2019. https://www.comp.nus.edu.sg/~reza/files/Shokri-SP2019.pdf
Shokri, Reza, et al. "Quantifying location privacy." 2011 IEEE symposium on security and privacy. IEEE, 2011. https://orca.cardiff.ac.uk/37912/1/Quantifying_Location_Privacy.pdf
Ateniese, Giuseppe, et al. "Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers." International Journal of Security and Networks 10.3 (2015): 137-150. https://arxiv.org/pdf/1306.4447.pdf
Murakonda, Sasi Kumar, and Reza Shokri. "Ml privacy meter: Aiding regulatory compliance by quantifying the privacy risks of machine learning." arXiv preprint arXiv:2007.09339 (2020). https://arxiv.org/pdf/2007.09339.pdf
Tseng, Wei-Cheng, Wei-Tsung Kao, and Hung-yi Lee. "Membership Inference Attacks Against Self-supervised Speech Models." arXiv preprint arXiv:2111.05113 (2021).

May

Tramèr, Florian, et al. "Stealing Machine Learning Models via Prediction APIs." 25th USENIX security symposium (USENIX Security 16). 2016. https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf
Salem, Ahmed, et al. "Ml-leaks: Model and data independent membership inference attacks and defenses on machine learning models." arXiv preprint arXiv:1806.01246 (2018). https://arxiv.org/pdf/1806.01246.pdf
Lindell, Yehuda, and Benny Pinkas. "Privacy preserving data mining." Journal of cryptology 15.3 (2002). https://link.springer.com/content/pdf/10.1007/s00145-001-0019-2.pdf
Canny, John. "Collaborative filtering with privacy via factor analysis." Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval. 2002. https://www.researchgate.net/profile/John-Canny/publication/3948694_Collaborative_Filtering_with_Privacy/links/5581e03708ae6cf036c16ff4/Collaborative-Filtering-with-Privacy.pdf
Stewart, Kathy A., and Albert H. Segars. "An empirical examination of the concern for information privacy instrument." Information systems research 13.1 (2002): 36-49. https://www.researchgate.net/profile/Albert-Segars-2/publication/220079710_An_Empirical_Examination_of_the_Concern_for_Information_Privacy_Instrument/links/5be32f36299bf1124fc2da16/An-Empirical-Examination-of-the-Concern-for-Information-Privacy-Instrument.pdf
Vaidya, Jaideep, and Chris Clifton. "Privacy-preserving k-means clustering over vertically partitioned data." Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. 2003. https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2003-47.pdf
Lo Piano, S. (2020). Ethical principles in machine learning and artificial intelligence: cases from the field and possible ways forward. Humanities and Social Sciences Communications, 7(1), 1–7. https://doi.org/10.1057/s41599-020-0501-9 https://www.nature.com/articles/s41599-020-0501-9.pdf
Choquette-Choo, Christopher A., et al. "Label-only membership inference attacks." International Conference on Machine Learning. PMLR, 2021. https://arxiv.org/pdf/2007.14321.pdf
Kifer, Daniel, and Ashwin Machanavajjhala. "Pufferfish: A framework for mathematical privacy definitions." ACM Transactions on Database Systems (TODS) 39.1 (2014): 1-36. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.436.2576&rep=rep1&type=pdf
Damien Desfontaines, and Balázs Pejó, "SoK: Differential Privacies A taxonomy of differential privacy variants and extensions", 2019. https://arxiv.org/pdf/1906.01337.pdf
Josh Smith, et al. "Making the Most of Parallel Composition in Differential Privacy", Privacy Enhancing Technologies Symposium (PETS) 2022. https://petsymposium.org/2022/files/papers/issue1/popets-2022-0013.pdf

June

Cormode, Graham, et al. "Empirical privacy and empirical utility of anonymized data." 2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW). IEEE, 2013. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.591.8092&rep=rep1&type=pdf
小栗秀暢. "プライバシー保護データ流通のための匿名化手法." システム/制御/情報 63.2 (2019): 51-57. https://www.jstage.jst.go.jp/article/isciesci/63/2/63_51/_pdf
高橋翼. 系列データの匿名化に関する研究. Diss. 筑波大学 (University of Tsukuba), 2014. https://core.ac.uk/download/pdf/56657868.pdf
Ayala-Rivera, Vanessa, et al. "A systematic comparison and evaluation of k-anonymization algorithms for practitioners." Transactions on data privacy 7.3 (2014): 337-370. https://researchrepository.ucd.ie/handle/10197/9109
Dwork, Cynthia, et al. "Calibrating noise to sensitivity in private data analysis." Theory of cryptography conference. Springer, Berlin, Heidelberg, 2006. https://link.springer.com/content/pdf/10.1007/11681878_14.pdf
Nicholas Carlini, et al. "(Certified!!) Adversarial Robustness for Free!" arXiv preprint arXiv:2206.10550 (2022). https://arxiv.org/pdf/2206.10550.pdf

July

U.S. Department of Commerce (1978), Statistical Policy Working Paper 2: Report on Statistical Disclosure and Disclosure-Avoidance Techniques, U.S. Government Printing Office, Washington, DC. https://nces.ed.gov/fcsm/pdf/spwp2.pdf
Kellaris, Georgios, et al. "Differentially private event sequences over infinite streams." Proceedings of the VLDB Endowment 7.12 (2014): 1155-1166. http://people.csail.mit.edu/stavrosp/papers/vldb2014/VLDB14_WDP.pdf

Books

Books including manga: 29 (-34 year over year*)

*Starting this year, magazines are no longer counted.

Technical books

None

Business and leisure

映画を早送りで観る人たち～ファスト映画・ネタバレ――コンテンツ消費の現在形～ (光文社新書)
オルレアンの少女 (岩波文庫赤 410-10)
ジャンヌ・ダルク超異端の聖女 (講談社学術文庫)
プロジェクト・ヘイル・メアリー上
プロジェクト・ヘイル・メアリー下
人間ぎらい (新潮文庫)
令和2年改正個人情報保護法の実務対応-Q&Aと事例-

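Articles

As last year, I did not read many articles. I did, however, start reading English articles. Below are the articles I was glad to have read this year.

GitHub

Compared with last year's 1,249 contributions, the change was -1,231.
This is because I often wrote code and queries that are not managed on GitHub.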

Scrapbox

1376 → 1611 pages

2022-12-25

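Thoughts after a few months of using dbt

This post is the day-25 entry of dbt Advent Calendar 2022. I have been using dbt for a few months, so I will write down my impressions so far.

Introduction

dbt (data build tool) is a framework for data transformation [1]. It delivers quality assurance for data analysis, which is essential in data engineering, through data model building, testing, and documentation. It is designed above all to be used together with a DWH, and it fills the Transform role in ETL.

As discussed in session [3] at Coalesce 2022, over the past 25 years storage costs have fallen and data has become easier to access, and more recently there has been a shift from ETL toward ELT, in which all transformations are done inside a single DWH. I think this shift has created an environment in which dbt can show its real strength.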

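Data analysis as a team

To me, the essential value of dbt is managing the data analysis process and making it more efficient.

In modern data analysis in particular, a single data engineer rarely carries all of the analysis work. What is needed is to build and manage a data analysis process through which the team or organization can make its analysis more sophisticated and deliver more value to customers.

In my own work, too, I saw problems with a process in which we hand-wrote structurally identical queries over and over and reviewed them every time. From the developer's side, the question is how many times the same query has to be written. From the reviewer's and the team's side, review quality drops because of the mindset that the query is almost the same as the last one.

Anyone who has written SQL in a team will know that code review of queries takes a lot of concentration. A review that breaks the query into CTEs, runs them locally against small dummy data, and then judges whether the query as a whole is also correct is a lot of work.

Sometimes willpower gives out and a query gets through with a shallow review, and as that accumulates you end up with queries that only their author understands. Larger queries also tend to have complicated dependencies, which raises questions such as which intermediate table to use and how far its quality is guaranteed (you end up going around asking table authors, "Is this intermediate table up to date?"). The result can be low-quality analysis delivered to customers. That can steer data- and fact-based decisions in the wrong direction and lead to business losses and damaged trust.

To summarize, I saw the following three problems.

Dependencies become complex, and the cognitive load is high
The constraints each column of an intermediate table must satisfy are unknown, so the design is not visible
Similar queries have to be reviewed every time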

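How dbt addresses them

Dependencies become complex, and the cognitive load is high

For this problem, the simple fact that the dbt docs command visualizes the dependencies goes a long way toward understanding them. Session [4] at Coalesce 2022 classifies modeling patterns into the following four and shows a workshop on refactoring them. Once dependencies are visualized, you can tell at a glance which pattern applies, so the steps toward improvement work like an escalator, and I see that as another strength of dbt as a product.

Joining data directly from sources
Joining the same dataset multiple times
Joining a dependency of a dependency
Using a source in the final output

(quoted from presentation [4])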

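The constraints each column of an intermediate table must satisfy are unknown, so the design is not visible

Being able to write tests makes the constraints on each column visible.
The unique, not_null, accepted_values, and relationships tests (the last one ensures referential integrity, that is, that the column's values exist in another table) are provided out of the box [5], and dbt-utils [6] offers solutions for tests beyond the reach of these standard features (I often use unique_combination_of_columns when I want to guarantee uniqueness across multiple columns). These are in practice Jinja macros as well, so when even they are not enough you can write your own, which I think gives sufficient flexibility.

[2], although it speaks in the context of data profiling, data contracts, and data pipelines, proposes the following flow as an approach to data quality problems.

1. Identify low-quality data (outliers and so on)
2. Define the acceptable values
3. Catch errors in the data pipeline before they reach the customer

dbt tests can be seen as guarantees for steps 2 and 3.

Personally, I also like that the tests themselves serve as documentation: looking at them makes it easy to understand what constraints apply to each column (that is, what the design is).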

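Similar queries have to be reviewed every time

Data transformation in dbt is implemented by combining SQL and Jinja [7]. Jinja [8] is a template engine with Python-like syntax. The macros available in dbt can also be written in Jinja, so complex queries can be generalized and made reusable. This means that low-level queries (for example, ones heavy on arithmetic) and queries that need careful review can be reviewed thoroughly once and then reused to generate queries without losing quality.

This Jinja macro approach has solved the problem to a degree, but I feel the maintenance cost of macros cannot be ignored either. How to balance solving things through modeling and solving them through macros is something I am personally interested in. If I break down the queries targeted by the "similar queries have to be reviewed every time" problem, there seem to be "queries that share logic" and "low-level queries", so it may be best to address the former with modeling plus macros and the latter with Python models [9], introduced in dbt Core v1.3. I want to think this through further (I have not watched it yet, but I also wonder whether the Coalesce 2022 session [10] on why Python was chosen as the second language touches on this).

Closing

That was a brief set of impressions after a few months of using dbt. There is still a lot I do not know, but the problems I felt before adopting it have been resolved, and I am feeling the benefits of dbt. Above all, compared with when I was writing complex SQL, I can develop with the reassurance that things are structured, properly managed, and quality-assured. Reading dbt Advent Calendar 2022, I learned many things for the first time and sensed that the ecosystem itself is about to grow substantially. I am very much looking forward to seeing how this technology changes in 2023.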

References

2021-12-31

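2021 in Review

A summary of what I did in 2021, as a year-end retrospective.

Goal management

Continuing from last year, I ran a cycle of quarterly reviews and monthly progress checks: in January, April, July, and October I reviewed my goals and set goals for the next quarter, and in February, March, May, June, August, September, November, and December I checked progress.

I set the following six top-level categories and managed goals as small items each quarter. The numbers in parentheses are the small goals achieved out of the total (goals recovered in the following quarter count as achieved).

Technology (4/12)
Reading (10/17)
Health (6/8)
English (5/7)
Hobbies (5/11)
Asset management (2/5)

Overall, the achievement rate was 53%, a large drop from last year's 65%.
In the monthly review I often found that I had once again failed to start on a goal, which greatly reduced my motivation for goal management itself and also became a mental burden.

Next year I will keep the following in mind.

Be selective about goals
Plan down to where the time for working on each goal will come from

Papers

I also read Japanese-language papers, financial institution reports, and earnings materials, but they are not included here.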

Christodorescu, Mihai, et al. "Towards a Two-Tier Hierarchical Infrastructure: An Offline Payment System for Central Bank Digital Currencies." arXiv preprint arXiv:2012.08003 (2020).
Han, Shujie, et al. "An In-Depth Study of Correlated Failures in Production SSD-Based Data Centers." 19th USENIX Conference on File and Storage Technologies (FAST 21). 2021.
Bahmani, Raad, et al. "CURE: A Security Architecture with CUstomizable and Resilient Enclaves." 30th USENIX Security Symposium (USENIX Security 21). 2021.
Vasily A. Sartakov, et al. "Spons & Shields: practical isolation for trusted execution" The 17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments. 2021.
van Schaik, Stephan, et al. "CacheOut: Leaking data on Intel CPUs via cache evictions." 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 2021.
Alexander Sprogø Banks, "Remote Attestation: A Literature Review", 2021.
Purnal, Antoon, et al. "Systematic analysis of randomization-based protected cache architectures." 42nd IEEE Symposium on Security and Privacy. Vol. 5. 2021.
Lefeuvre, Hugo, et al. "FlexOS: making OS isolation flexible." Proceedings of the Workshop on Hot Topics in Operating Systems. 2021.
Nider, Joel, and Alexandra Fedorova. "The last CPU." Proceedings of the Workshop on Hot Topics in Operating Systems. 2021.
Lillian Tsai, et al. "Privacy Heroes Need Data Disguises" Proceedings of the Workshop on Hot Topics in Operating Systems. 2021.
Feng, Erhu, et al. "Scalable Memory Protection in the PENGLAI Enclave." 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). 2021.
Kumar, Sam, David E. Culler, and Raluca Ada Popa. "MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation." 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). 2021.
Farke, Florian M., et al. "Are Privacy Dashboards Good for End Users? Evaluating User Perceptions and Reactions to Google's My Activity." 30th USENIX Security Symposium (USENIX Security 21). 2021.
Julie Haney, et al. "'It's the Company, the Government, You and I': User Perceptions of Responsibility for Smart Home Privacy and Security", 30th USENIX Security Symposium (USENIX Security 21). 2021.
Messing, Solomon, et al. "Facebook Privacy-Protected Full URLs Data Set." 2020.
Chanyaswad, Thee, et al. "Mvg mechanism: Differential privacy under matrix-valued query." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018.
Abowd, John, et al. "An Uncertainty Principle is a Price of Privacy-Preserving Microdata." Advances in Neural Information Processing Systems 34 (2021).
Costan, Victor, and Srinivas Devadas. "Intel sgx explained." IACR Cryptol. ePrint Arch. 2016.86 (2016): 1-118.
Yoon, Jinsung, Daniel Jarrett, and Mihaela Van der Schaar. "Time-series generative adversarial networks." (2019).
Karwa, Vishesh, and Salil Vadhan. "Finite sample differentially private confidence intervals." arXiv preprint arXiv:1711.03908 (2017).
Balle, Borja, et al. "Hypothesis testing interpretations and renyi differential privacy." International Conference on Artificial Intelligence and Statistics. PMLR, 2020.
Mironov, Ilya, Kunal Talwar, and Li Zhang. "Rényi Differential Privacy of the Sampled Gaussian Mechanism." arXiv preprint arXiv:1908.10530 (2019).

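Books

HashHub Research reports: 317 (+6 year over year)
Books including manga and magazines: 63 (-117 year over year)

Technical books

One more than last year, but I do not feel I read them with full attention.

Business and leisure

One issue this year was that I could not build reading time into my daily life. I will bring back morning reading.

Articles

As last year, I did not read many articles.

Below is a list drawn from what I favorited in Pocket. Only links that were still valid as of December 31, 2021 are included. They are roughly in the order I read them, and because this is a list of "articles I was glad to have read in 2021", it includes articles published in 2020 or earlier. Subjectively, I favorited articles less often than last year.

GitHub

Compared with last year's 1,038 contributions, the change was +211.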

Scrapbox

765 → 1376 pages

2020-12-31

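2020 in Review

A summary of what I did in 2020, as a year-end retrospective.

Goal management

I ran a cycle of quarterly reviews and monthly progress checks.

In January, April, July, and October I reviewed my goals and set goals for the next quarter.
In February, March, May, June, August, September, November, and December I checked progress.

Technology (5/9)
Reading (9/15)
Health (7/9)
English (4/7)
Hobbies (9/12)

Overall, the achievement rate was 65%.

This was my third year of goal management; the routine is now established and the format of the goal management sheet has stabilized. On the other hand, it is a negative that the achievement rate was only as above even though I rarely slacked off, especially in technology and reading. The cause is that the goals were too ambitious, so for 2021 I will switch to setting goals based on this year's velocity.

Papers

I also read Japanese-language papers, financial institution reports, and earnings materials, but they are not included here.
Through the LayerX Newsletter¹, which I write every other week, I built a habit of reading papers continuously. However, I choose what to read based on the conferences being held, so building a habit of reading older papers remains a challenge.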

Cheng, Raymond, et al. "Ekiden: A platform for confidentiality-preserving, trustworthy, and performant smart contracts." 2019 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2019.
Yan, Ying, et al. "Confidentiality Support over Financial Grade Consortium Blockchain." Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2020.
Russinovich, Mark, et al. "CCF: A framework for building confidential verifiable replicated services." Technical Report MSR-TR-2019-16, Microsoft, 2019.
Orenbach, Meni, et al. "Eleos: ExitLess OS services for SGX enclaves." Proceedings of the Twelfth European Conference on Computer Systems. 2017.
Baumann, Andrew. "Hardware is the new software." Proceedings of the 16th Workshop on Hot Topics in Operating Systems. 2017.
Mahmud, Samin Yaseer, et al. "Cardpliance:PCI DSS Compliance of Android Applications." 29th USENIX Security Symposium (USENIX Security 20). 2020.
Ye, Guixin, et al. "Yet another text captcha solver: A generative adversarial network based approach." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018.
Konoth, Radhesh Krishnan, et al. "SecurePay: Strengthening Two-Factor Authentication for Arbitrary Transactions.", EuroS&P, 2020.
Lovisotto, Giulio, Simon Eberz, and Ivan Martinovic. "Biometric Backdoors: A Poisoning Attack Against Unsupervised Template Updating." arXiv preprint arXiv:1905.09162 (2019).
Wan, Shengye, et al. "RusTEE: Developing Memory-Safe ARM TrustZone Applications." Annual Computer Security Applications Conference. 2020.
He, Yun, et al. "EnclavePDP: A General Framework to Verify Data Integrity in Cloud Using Intel SGX." 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020). 2020.
Androulaki, Elli, et al. "Privacy-preserving auditable token payments in a permissioned blockchain system." Proceedings of the 2nd ACM Conference on Advances in Financial Technologies. 2020.
Tamrakar, Sandeep, Jan-Erik Ekberg, and Pekka Laitinen. "On rehoming the electronic id to TEEs." 2015 IEEE Trustcom/BigDataSE/ISPA. Vol. 1. IEEE, 2015.
Narayanan, Vikram, et al. "RedLeaf: Isolation and Communication in a Safe Operating System." 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 2020.
Jones, Kailani R., et al. "Deploying Android Security Updates: an Extensive Study Involving Manufacturers, Carriers, and End Users." Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. 2020.
Robert Merget, et al. "Raccoon Attack: Finding and Exploiting Most-Significant-Bit-Oracles in TLS-DH(E)", Real World Crypto. 2021.

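Books

HashHub Research reports: 311
Books including manga and magazines: 180

Technical books

I read fewer technical books. I should read more. This year I deliberately paused my goal of learning one new language per year, but I will bring it back next year.

Business books

The two best were 「失敗の本質」 and 「決済システムのすべて」. Both are heavy reads, but I felt they are must-reads in the contexts of organizations and payment systems respectively. My selection also leaned toward general cultivation, with books on the Hyakunin Isshu and Confucius. In 2021 I would like to read books on art and on stage lighting and direction.

Articles

Because I spent more time reading papers, I spent less time reading articles. I see this as a positive.

Below is a list drawn from what I favorited in Pocket. Only links that were still valid as of December 31, 2020 are included. Unfortunately, some articles I was glad to have read now have dead links.
They are roughly in the order I read them, and because this is a list of "articles I was glad to have read in 2020", it includes articles published in 2019 or earlier. Subjectively, I favorited articles less often than last year.
Looking over the list, the genres are Rust, CBDC, STO, and organizational theory. I also favorited more articles on low-level topics such as container runtimes, and I have come to feel "glad I read this" about articles that build a substantial argument from the fundamentals. The review makes visible what I actually found interesting, so I will do it again next year.
There are not many on the list, but this was a year in which I started reading articles in English and Chinese, even if that meant running them through a translator.

GitHub

Compared with last year's 873 contributions, the change was +155.

Scrapbox

I started using a private Scrapbox.

0→765pages

My hobby Scrapbox is separate and had 84 pages.

¹ I put a lot of effort into writing it, so please subscribe! (a plug)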

2020-12-23

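[Go] Authenticated public-key encryption with Sodium

This post is the day-23 entry of Go 2 Advent Calendar 2020.

What is Sodium?

Sodium [1] is a cryptography library developed with ease of use as a goal. Research on cryptographic misuse mostly discusses techniques for detecting and recovering from mistakes and their consequences, while methods that prevent misuse in the first place receive comparatively little discussion [13]. Sodium is one library that tackles this problem through ease of use.

Beyond ease of use, it also claims to be portable, cross-compilable, and installable. Searching awesome-cryptography [2] shows that it can be used from a variety of languages, as follows.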

jedisct1/libsodium: A modern, portable, easy to use crypto library.

JavaScript

Java

Kalium: Java binding to the Networking and Cryptography (NaCl) library with the awesomeness of libsodium

PHP

Rust

sodiumoxide/sodiumoxide: Sodium Oxide: Fast cryptographic library for Rust bindings to libsodium

Swift

jedisct1/swift-sodium: Safe and easy to use crypto for iOS and macOS

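Using Sodium from Go

Sodium can be used from Go as well. The package is nacl/ · pkg.go.dev [15]. As described in [1] and [14], Sodium is a fork of NaCl, and Go provides NaCl as the nacl package in its quasi-standard library (golang.org/x/crypto).

This time I will try the public-key message authentication and encryption provided by nacl/box. The content is based on [16].

Before moving on to the implementation, here are a few characteristics and caveats of Sodium.
First, messages are encrypted, but their length is not hidden, so it is not suited to applications where the original plaintext could be guessed from the length of the ciphertext.
Second, Sodium attaches a nonce to each message, and it is the caller's responsibility to use a different nonce for every message.
Finally, as [16] also notes, the whole message has to be held in memory to be processed, so small messages are recommended. Encryption does have some overhead, but it is said to be well amortized for messages of around 8 KB.

Now for the implementation itself. We will encrypt a message sent from the client to the server.

First, generate key pairs with GenerateKey. For authentication, both the client and the server generate a key pair. The snippets below assume the imports crypto_rand "crypto/rand", "io", and "golang.org/x/crypto/nacl/box".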

// client
clientPublicKey, clientPrivateKey, err := box.GenerateKey(crypto_rand.Reader)
if err != nil {
    panic(err)
}

// server
serverPublicKey, serverPrivateKey, err := box.GenerateKey(crypto_rand.Reader)
if err != nil {
    panic(err)
}

Next, generate the nonce to attach to the message. The nonce is 24 bytes long. Never reuse a nonce; generate a new one for every message.

var nonce [24]byte
if _, err := io.ReadFull(crypto_rand.Reader, nonce[:]); err != nil {
    panic(err)
}

Now for encryption, which uses Seal. With the message to encrypt defined as msg := []byte("sodium msg"), we apply authenticated public-key encryption.

encrypted := box.Seal(nonce[:], msg, &nonce, serverPublicKey, clientPrivateKey)

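Note that we pass not only the client's private key clientPrivateKey but also the server's public key serverPublicKey. Decryption requires the server's private key, and the client's public key authenticates that the message came from the client.

Incidentally, the nonce ends up at the front of the ciphertext: nonce[:] is passed as the first argument (out), and Seal appends its output to it. Running this in my environment gave the following result, where you can see that the first 24 bytes of encrypted are the nonce.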

nonce:     [176 240 87 14 190 92 224 60 245 16 119 163 71 18 1 177 57 118 139 46 141 4 99 117]
encrypted: [176 240 87 14 190 92 224 60 245 16 119 163 71 18 1 177 57 118 139 46 141 4 99 117 65 153 138 25 206 247 18 42 0 162 85 88 223 68 203 63 241 28 35 232 242 176 184 56 97 177]

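len(encrypted) above is 50 bytes. msg (sodium msg) is 10 bytes, so even after subtracting the 24-byte nonce there are 16 bytes of overhead (the box.Overhead constant). This overhead is added for authentication and is designed not to overlap with the ciphertext.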
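The length arithmetic above can be written out as a small sketch. The constants nonceSize and overhead and the function sealedLen are names made up for this illustration; the values 24 and 16 are assumed to match the nonce length and box.Overhead in nacl/box.

```go
package main

import "fmt"

// Sizes assumed to mirror golang.org/x/crypto/nacl/box: a 24-byte nonce
// and a 16-byte authenticator (box.Overhead).
const (
	nonceSize = 24
	overhead  = 16
)

// sealedLen returns the length of nonce || ciphertext for a plaintext of
// msgLen bytes when the nonce is prepended, as in the Seal call above.
// It is a hypothetical helper, not part of nacl/box.
func sealedLen(msgLen int) int {
	return nonceSize + msgLen + overhead
}

func main() {
	fmt.Println(sealedLen(len("sodium msg"))) // 24 + 10 + 16 = 50
}
```

Because the overhead is constant, the plaintext length can be recovered from the ciphertext length, which is why the message length is not hidden.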

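Now let's decrypt. In a real application, decryption happens after encrypted has been sent from the client to the server, but here we proceed as if the server has already received it.

A message sealed with Seal is decrypted with Open, whose signature is as follows.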

func Open(out, box []byte, nonce *[24]byte, peersPublicKey, privateKey *[32]byte) ([]byte, bool)

Look at the third argument: a nonce is required. Recalling that the first 24 bytes of encrypted are the nonce, we extract it from encrypted.

var decryptNonce [24]byte
copy(decryptNonce[:], encrypted[:24])
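The two lines above assume that encrypted holds at least 24 bytes. A server reading untrusted input may want to check that first. The following is a minimal sketch under that assumption; splitNonce is a hypothetical helper written for this illustration and is not part of nacl/box.

```go
package main

import (
	"errors"
	"fmt"
)

const nonceSize = 24 // nonce length used by nacl/box

// splitNonce separates a nonce-prefixed message into the nonce and the
// remaining ciphertext. It returns an error instead of slicing when the
// input is too short to contain a nonce.
func splitNonce(sealed []byte) (nonce [nonceSize]byte, ciphertext []byte, err error) {
	if len(sealed) < nonceSize {
		return nonce, nil, errors.New("message shorter than nonce")
	}
	copy(nonce[:], sealed[:nonceSize])
	return nonce, sealed[nonceSize:], nil
}

func main() {
	sealed := make([]byte, 50) // same length as the example output above
	nonce, ciphertext, err := splitNonce(sealed)
	fmt.Println(len(nonce), len(ciphertext), err) // 24 26 <nil>
}
```

The returned nonce and ciphertext correspond to &decryptNonce and encrypted[24:] in the Open call.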

Now that all the required arguments are in place, run Open with the server's private key serverPrivateKey and the client's public key clientPublicKey.

decrypted, _ := box.Open(nil, encrypted[24:], &decryptNonce, clientPublicKey, serverPrivateKey)

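With the correct server private key and client public key, authentication and decryption succeed and we get a decrypted identical to the original msg. Open's second return value, discarded above as _, reports whether this succeeded; real code should check it and reject the message when it is false.

Summary

I introduced Sodium and tried authenticated public-key encryption and decryption in Go. Because public-key encryption comes with authentication, it should be useful when you want assurance that a message came from the right party. The number of supported languages is also appealing. I have checked compatibility between Rust and js locally and would like to write that up some day if I get the chance. I would also like to try compatibility with Go.

There also appear to be similar projects such as Monocypher [17], Themis [18], and Tink [19]. Tink in particular interests me because it is a Google project. The API interfaces seem to differ, so I would like to try them out, with usability in mind as well.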

References

[1] Introduction - libsodium https://doc.libsodium.org/
[2] sobolevn/awesome-cryptography: A curated list of cryptography resources and links. https://github.com/sobolevn/awesome-cryptography
[3] jedisct1/libsodium: A modern, portable, easy to use crypto library. https://github.com/jedisct1/libsodium
[4] adamcaudill/libsodium-net: libsodium for .NET - A secure cryptographic library https://github.com/adamcaudill/libsodium-net
[5] bitbeans/StreamCryptor: Stream encryption & decryption with libsodium and protobuf https://github.com/bitbeans/StreamCryptor
[6] tonyg/js-nacl: Pure-Javascript High-level API to Emscripten-compiled libsodium routines. https://github.com/tonyg/js-nacl
[7] jedisct1/libsodium.js: libsodium compiled to Webassembly and pure JavaScript, with convenient wrappers. https://github.com/jedisct1/libsodium.js
[8] Kalium: Java binding to the Networking and Cryptography (NaCl) library with the awesomeness of libsodium http://abstractj.github.io/kalium/
[9] Halite - Simple PHP Cryptography Library - Paragon Initiative Enterprises https://paragonie.com/project/halite
[10] scrothers/libsodium-laravel: Laravel integration for libsodium https://github.com/scrothers/libsodium-laravel
[11] sodiumoxide/sodiumoxide: Sodium Oxide: Fast cryptographic library for Rust (bindings to libsodium) https://github.com/sodiumoxide/sodiumoxide
[12] jedisct1/swift-sodium: Safe and easy to use crypto for iOS and macOS https://github.com/jedisct1/swift-sodium
[13] Blochberger, Maximilian, Tom Petersen, and Hannes Federrath. "Mitigating Cryptographic Mistakes by Design." Mensch und Computer 2019-Workshopband (2019). https://svs.informatik.uni-hamburg.de/publications/2019/2019-09-05-crypto-api-design-muc2019.pdf
[14] NaCl (ソフトウェア) - Wikipedia https://ja.wikipedia.org/wiki/NaCl_(%E3%82%BD%E3%83%95%E3%83%88%E3%82%A6%E3%82%A7%E3%82%A2)
[15] nacl/ · pkg.go.dev https://pkg.go.dev/golang.org/x/crypto/nacl
[16] box · pkg.go.dev https://pkg.go.dev/golang.org/x/crypto@v0.0.0-20201217014255-9d1352758620/nacl/box
[17] Monocypher: Boring crypto that simply works, https://monocypher.org/
[18] Themis: Cross-platform library for secure data storage, message exchange, socket connections, and authentication, https://www.cossacklabs.com/themis/
[19] google/tink: Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse. https://github.com/google/tink

2020-12-05

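Variadic functions and HList in Rust

This post is the day-5 entry of Rust Advent Calendar 2020.

Background

As RFCs#2137 describes, Rust does not let you write variadic functions directly. That does not mean it is entirely impossible, though. To allow variadic function calls from C, you can write a stub function. For example, you can implement a function like the following in Rust.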

// T and U are placeholders for concrete argument types.
pub unsafe extern "C" fn func(arg: T, arg2: U, mut args: ...) {
    // do something
}

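Such a function must be declared with extern "C" and must be marked unsafe.

Another approach is to use a macro, and something like the following should do. It handles arg and arg2 first, then recursively processes args, which plays the role of the variadic arguments.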

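macro_rules! func {
    // Entry point: two fixed arguments followed by one or more variadic ones.
    ( $arg:expr, $arg2:expr, $($args:expr),+ ) => {
        do_something_for_arg($arg);
        do_something_for_arg1($arg2);
        func!(@inner $($args),+);
    };

    // Base case: the last variadic argument.
    (@inner $tail:expr ) => {
        do_something_args($tail);
    };

    // Recursive case: handle the head, then recurse on the rest.
    (@inner $head:expr, $($cons:expr),+ ) => {
        do_something_args($head);
        func!(@inner $($cons),+);
    };
}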
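To make the recursion concrete, here is a small runnable variant of the same pattern. It is only a sketch: the helper show and the macro join_all! are names made up for this example, and the macro returns a String instead of calling the placeholder functions above.

```rust
// Hypothetical helper for this sketch: formats a single argument.
fn show<T: std::fmt::Display>(value: T) -> String {
    format!("[{}]", value)
}

// Recursive macro: peel off the head, then recurse on the remaining arguments.
macro_rules! join_all {
    // Base case: exactly one argument is left.
    ($last:expr) => {
        show($last)
    };
    // Recursive case: a head followed by one or more further arguments.
    ($head:expr, $($rest:expr),+) => {
        show($head) + &join_all!($($rest),+)
    };
}

fn main() {
    // Arguments of different types are accepted because each one is
    // passed to the generic helper separately.
    let s = join_all!(1, "a", true);
    println!("{}", s); // prints "[1][a][true]"
}
```

Each expansion step consumes one argument, so the number of arguments is fixed at the call site and fully expanded at compile time.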

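About HList

This post introduces HList as one way to implement variadic functions. HList stands for heterogeneous list: a list whose elements can have different types. For example, it can hold values like [1, "a", true], with element types [i32, &str, bool].
Similar data structures include Vec<T> and tuples, but Vec<T> differs in that every element has the same type T. A tuple such as (i32, &str, bool) can also hold different types, but a single piece of code cannot be written generically over tuples of every length. An HList's recursive structure allows exactly that, with the number of elements still resolved at compile time.

frunk

To work with HLists in Rust, we use the frunk crate. frunk is a crate for generic functional programming in Rust and supports the following.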

HLists (heterogeneously-typed lists)
LabelledGeneric, and Generic
Coproduct
Validated (accumulator for Result)
Semigroup
Monoid

Of these, this post uses HList. In the frunk crate, an HList is represented with HNil and HCons, which are implemented as follows.

pub struct HNil;

pub struct HCons<H, T> {
    pub head: H,
    pub tail: T,
}

The frunk crate defines an hlist macro, and the value returned by hlist![1, "a", true] has type HCons<i32, HCons<&str, HCons<bool, HNil>>>. An HList is thus expressed by combining HNil and HCons.
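The nesting can also be written out by hand. The sketch below uses local stand-ins that mirror the HNil and HCons definitions shown above, so it compiles without the frunk crate. The Len trait and the len_of function are hypothetical names introduced only for this example, to show that the length is carried by the type.

```rust
// Local stand-ins mirroring the definitions above (not the frunk crate itself).
pub struct HNil;

pub struct HCons<H, T> {
    pub head: H,
    pub tail: T,
}

// Hypothetical trait for this sketch: the number of elements as a constant.
pub trait Len {
    const LEN: usize;
}

// The empty list has length 0.
impl Len for HNil {
    const LEN: usize = 0;
}

// A cons cell is one element longer than its tail.
impl<H, T: Len> Len for HCons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

// Returns the length encoded in the list's type.
pub fn len_of<L: Len>(_list: &L) -> usize {
    L::LEN
}

fn main() {
    // Hand-written equivalent of hlist![1, "a", true].
    let list: HCons<i32, HCons<&str, HCons<bool, HNil>>> = HCons {
        head: 1,
        tail: HCons { head: "a", tail: HCons { head: true, tail: HNil } },
    };
    println!("{}", len_of(&list)); // prints "3"
    println!("{} {} {}", list.head, list.tail.head, list.tail.tail.head);
}
```

The two impl blocks cover lists of every length, which is the property the rest of this post relies on.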

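What we will build

As a representative example of a function that takes variadic arguments, we implement a print function. Seeing the usage makes it easier to picture, so let's write the test first.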

#[test]
fn my_print() {
    use frunk::hlist;
    use super::Printer;

    let args = hlist!["hello ", "world. ", 1, " == ", 2.0, " is ", false];
    let got = args.print();
    assert_eq!(got, "hello world. 1 == 2 is false");
}

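Note the line let args = hlist!["hello ", "world. ", 1, " == ", 2.0, " is ", false];. It is a list containing four different types: &str, i32, f64, and bool. Ideally I would have implemented a variadic function that takes args as its parameter directly, but here it is implemented, as an approximation, as a method on args. Implementing a Type-safe printf in Rust represents placeholders as FVar and implements a format function that takes an HList as its argument. Do take a look at that as well.

Implementation

First, define the Printer trait.
It has a print function that returns a String.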

pub trait Printer<Args> {
    fn print(&self) -> String;
}

Next, implement the print function for HNil and HCons so that they satisfy the Printer trait.

Start with HNil.

impl Printer<HNil> for HNil {
    fn print(&self) -> String {
        "".to_string()
    }
}

Once we reach HNil, we are at the end of the HList, so it returns an empty string. Simple.

HCons, on the other hand, is a little more involved.

impl<T, Args, PrintList> Printer<HCons<T, Args>> for HCons<T, PrintList>
    where
        PrintList: Printer<Args>,
        T: ToString,
{
    fn print(&self) -> String {
        self.head.to_string() + &self.tail.print()
    }
}

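The Args parameter of Printer<Args> is set to HCons<T, Args>, where T corresponds to the head and Args to the tail. The head should be converted to a String as-is, so T carries the ToString bound. The tail is processed recursively: it is bounded by PrintList: Printer<Args>, and print is called on it. This repeats until HNil is reached.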
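For comparison, the same head-then-tail recursion can be sketched without the extra type parameter. This is not the implementation used in this post: SimplePrinter is a hypothetical trait name, and HNil and HCons are local stand-ins for the frunk types so that the example compiles on its own.

```rust
// Local stand-ins for frunk's HNil and HCons.
pub struct HNil;

pub struct HCons<H, T> {
    pub head: H,
    pub tail: T,
}

// Hypothetical variant of Printer that drops the Args type parameter.
pub trait SimplePrinter {
    fn print(&self) -> String;
}

// Base case: the end of the list contributes nothing.
impl SimplePrinter for HNil {
    fn print(&self) -> String {
        String::new()
    }
}

// Recursive case: stringify the head, then append whatever the tail prints.
impl<H: ToString, T: SimplePrinter> SimplePrinter for HCons<H, T> {
    fn print(&self) -> String {
        self.head.to_string() + &self.tail.print()
    }
}

fn main() {
    let list = HCons {
        head: "a",
        tail: HCons { head: 1, tail: HCons { head: false, tail: HNil } },
    };
    // The calls unfold as: "a" + ("1" + ("false" + "")).
    println!("{}", list.print()); // prints "a1false"
}
```

Either way, the compiler picks the HCons impl once per element and the HNil impl once at the end, so the recursion depth equals the length of the list.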

That completes the print function for both HNil and HCons.

Checking that it works

Here, once more, is the complete code implemented in this post.

use frunk::hlist::{HNil, HCons};

pub trait Printer<Args> {
    fn print(&self) -> String;
}

impl Printer<HNil> for HNil {
    fn print(&self) -> String {
        "".to_string()
    }
}

impl<T, Args, PrintList> Printer<HCons<T, Args>> for HCons<T, PrintList>
    where
        PrintList: Printer<Args>,
        T: ToString,
{
    fn print(&self) -> String {
        self.head.to_string() + &self.tail.print()
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn my_print() {
        use frunk::hlist;
        use super::Printer;
        
        let args = hlist!["hello ", "world. ", 1, " == ", 2.0, " is ", false];
        let got = args.print();
        assert_eq!(got, "hello world. 1 == 2 is false");
    }
}

Here is the result of running the tests.

❯ cargo test
   Compiling rust-variadic-artgument v0.1.0 (/Users/cipepser/work/github.com/cipepser/rust-variadic-artgument)
    Finished test [unoptimized + debuginfo] target(s) in 0.47s
     Running target/debug/deps/rust_variadic_artgument-3442cb5a09929580

running 1 test
test tests::my_print ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests rust-variadic-artgument

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Looks good.
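The call site can be made to look closer to a variadic function by hiding the list construction behind a macro. The following is a minimal sketch under a few assumptions: HNil and HCons are local stand-ins for the frunk types, and my_hlist! and my_print! are hypothetical macros written for this example, not part of frunk.

```rust
// Local stand-ins for frunk's HNil and HCons.
pub struct HNil;

pub struct HCons<H, T> {
    pub head: H,
    pub tail: T,
}

// Same trait and impls as in this post.
pub trait Printer<Args> {
    fn print(&self) -> String;
}

impl Printer<HNil> for HNil {
    fn print(&self) -> String {
        "".to_string()
    }
}

impl<T, Args, PrintList> Printer<HCons<T, Args>> for HCons<T, PrintList>
where
    PrintList: Printer<Args>,
    T: ToString,
{
    fn print(&self) -> String {
        self.head.to_string() + &self.tail.print()
    }
}

// Hypothetical macro: builds nested HCons values from a comma-separated list.
macro_rules! my_hlist {
    () => { HNil };
    ($head:expr $(, $rest:expr)*) => {
        HCons { head: $head, tail: my_hlist!($($rest),*) }
    };
}

// Hypothetical macro: variadic-looking front end for Printer::print.
macro_rules! my_print {
    ($($arg:expr),*) => {
        my_hlist!($($arg),*).print()
    };
}

fn main() {
    let got = my_print!("hello ", "world. ", 1, " == ", 2.0, " is ", false);
    println!("{}", got); // prints "hello world. 1 == 2 is false"
}
```

This is still a macro at the call site rather than a true variadic function, but the per-element work is done by the trait impls rather than by macro expansion.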

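In closing

Starting from variadic functions, we implemented a print function with HList.

I first learned of the frunk crate while writing this post. It also handles Coproduct, Semigroup, and Monoid, so I plan to play with those when I get the chance.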