Selected Papers
- ``The Effects of STEF in Finely Parallel Multithreaded Processors'',
Proceedings of the First IEEE Symposium on High-Performance Computer Architecture,
January 22-25, 1995, North Carolina, U.S.A. IEEE Computer Society Press, pp318-325.
HPCA95.ps
HPCA95.pdf
- ``Design and Implementation of a Multiple-Instruction-Stream Multiple-Execution-Pipeline Architecture'',
International Conference on Parallel and Distributed Computing and Systems,
Oct. 19-21, 1995, Washington D.C., U.S.A. pp477-480.
PDCS95.ps
PDCS95.pdf
- ``Using Computer Architecture/Organization at the University of Aizu'',
Second Workshop on Computer Architecture Education,
Feb. 3, 1996, San Jose, California, U.S.A. Also see ``Using FPGA for Computer Architecture/Organization Education'',
IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter,
June 1996, IEEE Computer Society Press, pp31-35.
TCCA96JUNE.pdf
- ``Aizup -- A Pipelined Processor Design and Implementation on XILINX FPGA Chip'',
Proceedings of IEEE Symposium on FPGAs for Custom Computing Machines,
Apr.17-19, 1996, Napa, California, U.S.A. IEEE Computer Society Press, pp98-106.
FCCM96.ps
FCCM96.pdf
- ``A New Non-Restoring Square Root Algorithm and Its VLSI Implementations'',
Proceedings of International Conference on Computer Design -- VLSI in Computers and Processors,
Oct. 7-9, 1996, Austin, Texas, U.S.A. IEEE Computer Society Press, pp538-544.
ICCD96.ps
ICCD96.pdf
- ``Implementation of Single Precision Floating Point Square Root on FPGAs'',
IEEE Symposium on FPGAs for Custom Computing Machines,
Apr.16-18, 1997, Napa, California, U.S.A. IEEE Computer Society Press, pp226-232.
FCCM97.ps
FCCM97.pdf
- ``Parallel-Array Implementations of A Non-Restoring Square Root Algorithm'',
Proceedings of International Conference on Computer Design -- VLSI in Computers and Processors,
Oct. 12-15, 1997, Austin, Texas, U.S.A. IEEE Computer Society Press, pp690-695.
ICCD97.ps
ICCD97.pdf
- ``Memory Centric Interconnection Mechanism for Message Passing in Parallel Systems'',
Third International Conference on Massively Parallel Computing Systems,
Broadmoor Hotel, Colorado Springs, Colorado, USA, April 6-9, 1998.
MPCS98.pdf
- ``JAViR -- Exploiting Instruction Level Parallelism for JAVA Machine by Using Virtual Registers'',
The Second European Parallel and Distributed Systems Conference
Vienna, Austria, July 1-3, 1998. pp80-86.
EuroPDS98.pdf
- ``A Model for Predicting Utilization of Multiple Pipelines in MTMP Architecture'',
International Journal of Modelling and Simulation,
Volume 18, Issue 3, 1998. pp201-207.
IJMS98.pdf
- ``Parallelism of Java Bytecode Programs and a Java ILP Processor Architecture'',
Australian Computer Science Communications,
Vol.21, No.4, Springer-Verlag Singapore, 1999. pp75-84.
ACAC99.ps
ACAC99.pdf
- ``Cost/Performance Tradeoff of n-Select Square Root Implementations'',
Australian Computer Science Communications,
Vol.22, No.4, IEEE Computer Society Press, 2000, pp9-16.
ACAC2000.pdf
- ``Dual-Cubes: A New Interconnection Network for High-performance Computer Clusters'',
Workshop on Computer Architecture, International Computer Symposium 2000,
December 6-8, 2000, National Chung Cheng University, ChiaYi, Taiwan.
ICS2000.pdf
- ``Exploiting Java Instruction/Thread Level Parallelism with Horizontal Multithreading'',
Australian Computer Science Communications,
Vol.23, No.4, IEEE Computer Society Press, 2001.
ACSAC01.ps
ACSAC01.pdf
- ``Fault-tolerant Routing and Disjoint Paths in Dual-cube: a New Interconnection Network'',
Proceedings of the 2001 International Conference on Parallel and Distributed Systems (ICPADS'2001),
IEEE Computer Society Press, 2001, pp315-322.
ICPADS01.pdf
- ``Algorithms of Routing and Matrix Multiplication on Dualcube'',
Proceedings of the Second International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD'01),
Nagoya, Japan, Aug., 2001, pp422-429.
SNPD01.pdf
- ``Efficient Collective Communications in Dual-cube'',
Proceedings of the Thirteenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS-2001),
Anaheim, USA, Aug., 2001, pp266-271.
PDCS01.pdf
- ``Metacube -- A New Interconnection Network for Large Scale Parallel Systems'',
Australian Computer Science Communications,
Vol.24, No.3, 2002, Australian Computer Society, pp29-36.
ACSAC02.pdf
- ``Efficient Communication in Metacube: A New Interconnection Network'',
Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (I-SPAN 2002),
Manila, Philippines, May 2002, IEEE Computer Society Press, pp165-170.
ISPAN02.pdf
- ``Multinode Broadcasting in Metacube'',
Proceedings of the 3rd ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD'02),
Madrid, Spain, June 26-28, 2002, pp401-408.
SNPD02.pdf
- ``An Instruction Cache Architecture for Parallel Execution of Java Threads'',
Proceedings of the Fourth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'03),
Aug. 27-29, 2003 Chengdu, China. pp180-187.
https://doi.org/10.1109/PDCAT.2003.1236283
- ``Fault-Tolerant Cycle Embedding in Dual-Cube with Node Faulty'',
Proceedings of the Fourth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'03),
Aug. 27-29, 2003 Chengdu, China. pp71-78.
https://doi.org/10.1109/PDCAT.2003.1236261
- ``Efficient Collective Communications in Dual-cube'',
The Journal of Supercomputing,
Volume 28, Issue 1, Apr., 2004, pp71-90.
https://link.springer.com/article/10.1023/B:SUPE.0000014803.83151.dc
- ``An Efficient Algorithm for Fault Tolerant Routing Based on Adaptive Binomial-Tree Technique in Hypercubes'', Proceedings of the Fifth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'04), Dec. 8-10, 2004, Singapore. pp196-201. https://doi.org/10.1007/978-3-540-30501-9_45
- ``Binomial-Tree Fault Tolerant Routing in Dual-Cubes with Large Number of Faulty Nodes'', Proceedings of the International Symposium on Computational and Information Sciences (CIS'04). Shanghai, China, December 16-18, 2004. pp51-56. https://doi.org/10.1007/978-3-540-30497-5_9
- ``Adaptive Box-Based Efficient Fault-tolerant Routing in 3D Torus'', Proceedings of the 11th International Conference on Parallel and Distributed Systems (ICPADS 2005). Fukuoka, Japan, July 20 - 22, 2005. pp71-77. https://doi.org/10.1109/ICPADS.2005.64
- ``Fault-Tolerant Cycle Embedding in Dual-Cube with Node Faulty'', International Journal of High Performance Computing and Networking Vol. 3, No. 1, 2005. pp45-53. https://doi.org/10.1109/PDCAT.2003.1236261
- ``Online Adaptive Fault-Tolerant Routing in 2D Torus'', Proceedings of Third International Symposium on Parallel and Distributed Processing and Applications, ISPA 2005. Lecture Notes in Computer Science 3758 Springer 2005, ISBN 3-540-29769-3, Nanjing, China, November 2-5, 2005, pp150-161. https://doi.org/10.1007/11576235_20
- ``An Efficient Distributed Broadcasting Algorithm for Wireless Ad Hoc Networks'', Proceedings of the Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'05), Dec. 5-8, 2005 Dalian, China. IEEE Computer Society Press. pp75-79. https://doi.org/10.1109/PDCAT.2005.75
- ``An Efficient Algorithm for Finding an Almost Connected Dominating Set of Small Size on Wireless Ad Hoc Networks'', Proceedings of The Third IEEE International Conference on Mobile Ad-hoc and Sensor Systems (MASS2006), October 9 - 12, 2006 Vancouver, Canada. pp199-205. https://doi.org/10.1109/MOBHOC.2006.278557
- ``K-MCore for Multicasting on Mobile Ad Hoc Networks'', Proceedings of the Seventh International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'06), Taipei, Taiwan, Dec. 4-7, 2006. pp109-114. https://doi.org/10.1109/PDCAT.2006.76
- ``An Algorithm for Constructing Hamiltonian Cycle in Metacube Networks'', Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'07), Adelaide, Australia, December 3-6, 2007. pp285-292. https://doi.org/10.1109/PDCAT.2007.9
- ``Efficient Algorithms for finding a Trunk on a Tree Network and its Applications'', Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'07), Adelaide, Australia, December 3-6, 2007. pp355-362. https://doi.org/10.1109/PDCAT.2007.10
- ``K-Trunk and Efficient Algorithms for Finding a K-Trunk on a Tree Network'', Proceedings of the Forty-First Hawai'i International Conference on System Sciences (HICSS-41), Big Island, Hawaii, January 7-10, 2008. https://doi.org/10.1109/HICSS.2008.505
- ``A Distributed Algorithm for Finding a Tree Trunk and Its Application for Multicast in Mobile Ad Hoc Networks'', Proceedings of The IEEE 22nd International Conference on Advanced Information Networking and Applications (AINA 2008), March 25-28, 2008, Ginowan, Okinawa, Japan, IEEE Computer Society Press, pp106-113. https://doi.org/10.1109/AINA.2008.21
- ``K-tree Trunk and a Distributed Algorithm for Effective Overlay Multicast on Mobile Ad Hoc Networks'', Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (I-SPAN 2008), Sydney, Australia, May 2008, IEEE Computer Society Press, pp53-58. https://doi.org/10.1109/I-SPAN.2008.22
- ``Prefix Computation and Sorting in Dual-Cube'', Proceedings of the 37th International Conference on Parallel Processing (ICPP-08), Portland, Oregon, USA, September 8-12, 2008, IEEE Computer Society Press, pp389-396, https://doi.org/10.1109/ICPP.2008.18
- ``An Effective Structure for Algorithmic Design and a Parallel Prefix Algorithm on Metacube'', Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'08), Dunedin, New Zealand, December 1-4, 2008. IEEE Computer Society Press, pp54-61, https://doi.org/10.1109/PDCAT.2008.20
- ``Optimal Algorithms for Finding a Trunk on a Tree Network and its Applications'', The Computer Journal, Vol. 52, No. 2, March 2009, Oxford University Press, pp268-275. https://academic.oup.com/comjnl/article/52/2/268/371461
- ``An Efficient Parallel Sorting Algorithm on Metacube Multiprocessors'', Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP'09), Taipei, Taiwan, June, 2009. Springer-Verlag LNCS, pp372-383, https://doi.org/10.1007/978-3-642-03095-6_36
- ``Recursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation'', Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP'09), Taipei, Taiwan, June, 2009. Springer-Verlag LNCS, pp809-820, https://doi.org/10.1007/978-3-642-03095-6_76
- ``The Recursive Dual-net and its Applications'', Proceedings of the 8th International Conference on Advanced Parallel Processing Technologies, Rapperswil, Switzerland, August, 2009. Springer-Verlag LNCS, pp363-374, https://doi.org/10.1007/978-3-642-03644-6_29
- ``Hamiltonian Connectedness of Recursive Dual-Net'', Proceedings of the 9th International Conference on Computer and Information Technology, Xiamen, China, Oct., 2009. IEEE Computer Society Press, pp203-208, https://doi.org/10.1109/CIT.2009.17
- ``Disjoint-Paths and Fault-Tolerant Routing on Recursive Dual-Net'', Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'09), Hiroshima, Japan, December 8-11, 2009. IEEE Computer Society Press, pp48-56, https://doi.org/10.1109/PDCAT.2009.27
- ``Recursive Dual-Net: A New Versatile Network for Supercomputers of the Next Generation'', Journal of the Chinese Institute of Engineers, Vol. 32, 2009, pp931-938.
- ``A New Presentation of Metacube for Algorithmic Design and Case Studies: Parallel Prefix Computation and Parallel Sorting'', Journal of the Chinese Institute of Engineers, Vol. 32, 2009, pp939-950.
- ``Parallel Prefix Computation in the Recursive Dual-Net'', Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP2010), Busan, Korea, May, 2010. Springer-Verlag LNCS, pp54-64, https://doi.org/10.1007/978-3-642-13119-6_5
- ``Collective Communication in Recursive Dual-Net: An Interconnection Network for High-Performance Parallel Computer Systems of the Next Generation'', Proceedings of the 10th International Conference on Computer and Information Technology, Bradford, UK, June, 2010. IEEE Computer Society Press, pp147-154, https://doi.org/10.1109/CIT.2010.64
- ``Metacube --- a versatile family of interconnection networks for extremely large-scale supercomputers'', The Journal of Supercomputing, Volume 53, issue 2, 2010, pp329-351. http://www.springerlink.com/content/6724375112362qk5/
- ``Node-to-Set Disjoint-Paths Routing in Recursive Dual-Net'', Proceedings of the First International Conference on Networking and Computing, Higashi Hiroshima, Japan, Nov. 17-18, 2010. Conference Publishing Service.
- ``Parallel Sorting on Recursive Dual-Nets'', Proceedings of the 11th International Conference on Computer and Information Technology, Wuhan, China, Dec. 8-11, 2010. IEEE Computer Society Press, pp110-117, https://doi.org/10.1109/PDCAT.2010.56
- ``Finding a Hamiltonian Cycle in a Hierarchical Dual-Net with Base Network of p-Ary q-Cube'', Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP2011), Melbourne, Australia, Oct., 2011. Springer-Verlag LNCS, pp117-128, https://doi.org/10.1007/978-3-642-24650-0_11
- ``Parallel Prefix Computation and Sorting on a Recursive Dual-Net'', Journal of Information Processing Systems, Vol.7, No.2, 2011, pp271-286, https://doi.org/10.3745/JIPS.2011.7.2.271
- ``Hierarchical Dual-Net: A Flexible Interconnection Network and its Routing Algorithm'', International Journal of Networking and Computing, Vol.2 No.2, 2012, pp234-250, http://www.ijnc.org/index.php/ijnc/article/view/46
- ``Total Exchange Routing on Hierarchical Dual-Nets'', The 10th IFIP International Conference on Network and Parallel Computing, Guiyang, China, Sep. 19-21, 2013, pp179-193, https://doi.org/10.1007/978-3-642-40820-5_16
- ``Disjoint-Path Routing on Hierarchical Dual-Nets'', International Journal of Networking and Computing, Vol.4 No.2, July, 2014, pp260-278, http://www.ijnc.org/index.php/ijnc/article/view/86
- ``On the Efficiency of Round-2: A Simple Algorithm to Further Minimize Connected Dominating Set of Static Protocols in Ad Hoc Networks'', The 10th International Conference on Wireless Communications, Networking and Mobile Computing, Beijing, China, Sep. 26-28, 2014, pp462-467, https://doi.org/10.1049/ic.2014.0146
- ``Fault-Tolerant Routing Algorithms for Hierarchical Dual-Nets with Limited and Arbitrary Number of Faulty Nodes'', International Journal of Networking and Computing, Vol.5 No.2, 2015, pp329-346, http://www.ijnc.org/index.php/ijnc/article/view/111
- ``Computer Principles and Design in Verilog HDL'', John Wiley & Sons, ISBN 978-1-118-84109-9, 2015, 550 pages
- ``On the Improved Implementations and Performance Evaluation of Digit-by-Digit Integer Restoring and Non-Restoring Cube Root Algorithms'', Proceedings of the 2016 International Conference on Computer, Information and Telecommunication Systems, July 6-8, 2016, Kunming, China. pp21-25, https://doi.org/10.1109/CITS.2016.7546386
- ``Adjusting Parameters of k-Ary n-Cube to Achieve Better Cost Performance'', Proceedings of The 14th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA-16), Tianjin, China, 23-26 August, 2016. pp1218-1225. https://doi.org/10.1109/TrustCom.2016.0197
- ``Routing and Broadcasting Algorithms for Generalized-Star Cube'', International Journal of Networking and Computing, Vol.6 No.2, 2016, pp368-394, http://www.ijnc.org/index.php/ijnc/article/view/133
- ``KMS-Cube --- A General Alternative to Hypercubes for Reducing the Node Degree'', Proceedings of the 2017 International Conference on Computer, Information and Telecommunication Systems, July 21-23, 2017, Dalian, China. pp318-322. https://doi.org/10.1109/CITS.2017.8035291
- ``MiKANT: A Mirrored K-Ary N-Tree for Reducing Hardware Cost and Packet Latency of Fat-Tree and Clos Networks'', The 18th IEEE International Conference on Scalable Computing and Communications, October 8-12, 2018, Guangzhou, China. pp1643-1650. (Best Paper Award). https://doi.org/10.1109/SmartWorld.2018.00281
- ``Switch Fault Tolerance in a Mirrored K-Ary N-Tree'', Proceedings of the 2019 International Conference on Computer, Information and Telecommunication Systems, Aug. 28-31, 2019, Beijing, China, pp25-29 (Best Paper Award). https://doi.org/10.1109/CITS.2019.8862137
- ``Mirrored K-Ary N-Tree and its Efficiency of Fault Tolerance'', International Journal of Communication Systems, Vol. 33, Issue 17, November 25, 2020, e4594. First published: 22 August 2020, Wiley Online Library
- ``Link Fault Tolerant Routing Algorithms in Mirrored K-Ary N-Tree Interconnection Networks'', International Journal of Networking and Computing, Vol.11 No.2, 2021, pp140-173, http://www.ijnc.org/index.php/ijnc/article/view/247
- ``Fault Tolerance and Packet Latency of Peer Fat-Trees'', Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies, Dec. 2022, Sendai, Japan, pp413-425, https://doi.org/10.1007/978-3-031-29927-8_32
- ``Verilog HDL Implementation for an RSA Cryptography using Shift-Sub Modular Multiplication Algorithm'', Journal of Information Assurance and Security, Vol. 17, Issue 3, 2022, pp. 113-121, http://www.mirlabs.org/jias/secured/Volume17-Issue3/Paper11.pdf
- ``Hardware Implementations of Elliptic Curve Cryptography Using Shift-Sub Based Modular Multiplication Algorithms'', Cryptography, 7(4), 57, 2023, pp. 1-29, https://doi.org/10.3390/cryptography7040057
- ``An in-depth study of dimension-extended dragonfly interconnection network'', Concurrency and Computation - Practice and Experience, Vol. 36, Issue 27, First published: September 27, 2024, pp. 1-20, https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.8286
- ``Area-Time-Efficient High-Radix Modular Inversion Algorithm and Hardware Implementation for ECC over Prime Fields'', Computers, 13(10), 265, 2024, pp. 1-30, https://www.mdpi.com/2073-431X/13/10/265. Radix-8 modular inversion Verilog HDL code: modinv_r8.v; testbench: modinv_r8_tb.v; modular inversion Python code: modinv12345678.py; elliptic curve Diffie-Hellman (ECDH) key exchange Python code: ecdh.py.